7.1: Using Lucene Search with Custom Fields

I really like the performance and look of the new Search (Advanced OpenDiscover).

However, I added a custom field (accounts_cstm) for numeric IDs per account. The Default search can parse this field - how can I add it to the AOD/Lucene index?

Hi Klou,
Custom fields should be picked up by AOD.

If your account custom field is a text field with numbers in it, it should behave fine.

However, if the field type is numeric (such as integer, decimal, etc.) it won't be picked up. You can't change this behaviour other than by changing the indexing code, but this is a small change.

In modules/AOD_Index/AOD_Index.php, around line 150, there is a switch statement. At the end of it there is a group of field types which get ignored:

          
                case "address":
                case "bool":
                case "currency":
                case "date":
                case "datetimecombo":
                case "decimal":
                case "float":
                case "iframe":
                case "int":
                case "radioenum":
                case "relate":
                default:
                    break;

You can pull out each case that you want to index and move it to after the varchar case.

In your case you would remove


                case "int":

and move it next to the varchar case like:


                case "name":
                case "phone":
                case "html":
                case "text":
                case "url":
                case "varchar":
                case "int":

This will cause all int fields (in all indexed modules) to be indexed.

Hope this helps,
Jim

Thanks for the reply.

Unfortunately, it is already a text field - with the database field type nvarchar(255). So, not an int, and the field is not being indexed on two separate installations/databases.

This is the only custom field that is nvarchar = numeric string - all of the other ones seem to work.

Is there a function that's parsing the (numeric) string incorrectly?

The string isn't parsed; it's just indexed as is. Are other fields in the module being indexed?

If no modules are being indexed, then check that AOD is enabled in the settings and that the AOD scheduler is running (under Schedulers, check the last run date in particular).

If no field in the module is being indexed, then check that the module's vardefs has an entry for 'unified_search' and that it's set to true (this can be found in modules/YOURMODULE/vardefs.php or in the custom folders).

If it's just that one field, then it's possible that this is a bug. If that's the case, could you please provide a copy of the vardef for the field that isn't being indexed?
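As a rough illustration of the module-level check described above, a custom vardef extension might look like the sketch below. The file name is hypothetical, and Accounts is used only as an example module; after adding such a file you would normally run a Quick Repair and Rebuild so that the cached vardefs pick it up.

```php
<?php
// Hypothetical file: custom/Extension/modules/Accounts/Ext/Vardefs/aod_search.php
// The file name is an assumption for illustration only.

// Module-level flag: marks the module as taking part in unified (global) search.
$dictionary['Account']['unified_search'] = true;
```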

Finally please note that AOD searches are cached for 5 minutes or so. Waiting 5 minutes or slightly changing the search may produce different results.

Thanks,
Jim

Well, here's the thing: it's a custom field on the Accounts module. And actually, the vardefs isn't as helpful.

~/custom/Accounts/Ext/Vardefs/vardefs.ext.php
~/custom/Extension/modules/Accounts/Ext/Vardefs/*.php


... <snip>

$dictionary['Account']['fields']['account_id_c']['unified_search'] = true;

...

$dictionary['Account']['fields']['account_id_c']['labelValue']='Account ID';
$dictionary['Account']['fields']['account_id_c']['type']='varchar';

I added the last line, as it looks like there's no type definition for custom fields (this one, or any that I've added; only 'labelValue').

I opened up the index in /modules/AOD_Index/Index/Index with Luke (v0.99; 4.0.0-ALPHA apparently didn't like the Lucene version), and there are definitely some odd results.

The "name" records have a term count of 1750 (as it parses Accounts/Contacts/Targets), but 4 of my custom fields (varchar) on the Accounts module only recorded counts of 0, 4, 17, and 24. If I'm reading this right, these would be unique occurrences (which would explain the low numbers for fields such as Customer Class and Payment Terms), but the big fat 0 on my Account_ID is troubling.

Tracking further, here's the applicable entry in /cache/modules/Accounts/Accountvardefs.php


'account_id_c' =>
    array (
      'unified_search' => true,
      'labelValue' => 'Account ID',
      'required' => true,
      'source' => 'custom_fields',
      'name' => 'account_id_c',
      'vname' => 'LBL_ACCOUNT_ID',
      'type' => 'varchar',
      'massupdate' => 0,
      'default' => NULL,
      'no_default' => false,
      'comments' => '',
      'help' => '',
      'importable' => 'true',
      'duplicate_merge' => 'enabled',
      'duplicate_merge_dom_value' => 1,
      'audited' => true,
      'reportable' => true,
      'merge_filter' => 'disabled',
      'len' => 255,
      'size' => '20',
      'id' => 'Accounts_account_id_c',
      'custom_module' => 'Accounts',
    ),

Also Accounts snippet from /cache/modules/unified_search_modules.php


  'Accounts' =>
  array (
    'fields' =>
    array (
      'name' =>
      array (
        'query_type' => 'default',
      ),
      'phone' =>
      array (
        'query_type' => 'default',
        'db_field' =>
        array (
          0 => 'phone_office',
        ),
        'vname' => 'LBL_ANY_PHONE',
      ),
      'email' =>
      array (
        'query_type' => 'default',
        'operator' => 'subquery',
        'subquery' => 'SELECT eabr.bean_id FROM email_addr_bean_rel eabr JOIN email_addresses ea ON (ea.id = eabr.email_address_id) WHERE eabr.deleted=0 AND ea.email_address LIKE',
        'db_field' =>
        array (
          0 => 'id',
        ),
        'vname' => 'LBL_ANY_EMAIL',
      ),
      'account_id_c' =>
      array (
        'query_type' => 'default',
      ),
    ),
    'default' => true,
  ),

FWIW, I noticed that there aren't any numeric terms in any of the indexed text fields. I'll try to chase this down further on Monday...

FIXED.

The problem is that the default Lucene Analyzer does not process numbers. I've changed this by setting the default to Common_TextNum in the same getDocumentForBean function mentioned above.


    public function getDocumentForBean(SugarBean $bean){
        // Changed default Analyzer to TextNum
        Zend_Search_Lucene_Analysis_Analyzer::setDefault(new Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum());


        if($bean->module_name == 'DocumentRevisions'){
            $document = $this->getDocumentForRevision($bean);
        }else{
            $document = array("error"=>false,"document"=>new Zend_Search_Lucene_Document());
        }
        if($document["error"]){
            return $document;
        }
        $document["document"]->addField(Zend_Search_Lucene_Field::UnIndexed("aod_id", $bean->module_name." ".$bean->id));
        $document["document"]->addField(Zend_Search_Lucene_Field::UnIndexed("record_id", $bean->id));
        $document["document"]->addField(Zend_Search_Lucene_Field::UnIndexed("record_module", $bean->module_name));
        foreach($GLOBALS['dictionary'][$bean->getObjectName()]['fields'] as $key => $field){
            switch($field['type']){
                case "enum":
                    $document["document"]->addField(Zend_Search_Lucene_Field::Keyword($key, strtolower($bean->$key)));
                    break;

                case "multienum":
                    // Read the value by field name ($key); $field is the vardef array, not a property name.
                    $vals = unencodeMultienum($bean->$key);
                    $document["document"]->addField(Zend_Search_Lucene_Field::unStored($key, strtolower(implode(" ",$vals))));
                    break;
                case "name":
                case "phone":
                case "html":
                case "text":
                case "url":
                case "varchar":
                    if(property_exists($bean,$key)){
                        $val = strtolower($bean->$key);
                    }else{
                        $val = strtolower($bean->$key);
                    }
                    $field = Zend_Search_Lucene_Field::unStored($key, $val);
                    if($key == "name"){
                        $field->boost = 1.5;
                    }
                    $document["document"]->addField($field);
                    break;
                case "address":
                case "bool":
                case "currency":
                case "date":
                case "datetimecombo":
                case "decimal":
                case "float":
                case "iframe":
                case "int":
                case "radioenum":
                case "relate":
                default:
                    break;
            }
        }

        return $document;
    }
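To see why a purely numeric varchar value ends up with a term count of 0, the sketch below emulates the two tokenizing rules with plain regular expressions. It is not the Zend code itself: the function names and sample values are made up for illustration, and it assumes that the default Text analyzer family keeps only runs of letters while the TextNum family keeps runs of letters and digits.

```php
<?php
// Emulation only: these helpers mimic the tokenizing rules; they are not part of Zend.

/** Keeps runs of letters only (assumed behaviour of the Common_Text analyzers). */
function tokenizeLettersOnly(string $value): array
{
    preg_match_all('/[a-zA-Z]+/', $value, $matches);
    return $matches[0];
}

/** Keeps runs of letters and digits (assumed behaviour of the Common_TextNum analyzers). */
function tokenizeLettersAndDigits(string $value): array
{
    preg_match_all('/[a-zA-Z0-9]+/', $value, $matches);
    return $matches[0];
}

$accountId = '100234';      // hypothetical numeric ID stored in a varchar field
$mixed     = 'acme 42 ltd'; // hypothetical mixed value

echo json_encode(tokenizeLettersOnly($accountId)), "\n";      // []
echo json_encode(tokenizeLettersAndDigits($accountId)), "\n"; // ["100234"]
echo json_encode(tokenizeLettersOnly($mixed)), "\n";          // ["acme","ltd"]
echo json_encode(tokenizeLettersAndDigits($mixed)), "\n";     // ["acme","42","ltd"]
```

With the letters-only rule, a value made up entirely of digits produces no tokens at all, which matches the empty Account_ID field seen in Luke.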

http://framework.zend.com/manual/1.12/en/zend.search.lucene.extending.html

It looks like there's a TextNum_CaseInsensitive analyzer as well, so that could be used instead of some of the string-parsing within the function.
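A minimal sketch of that variant, assuming the Zend Framework 1 Lucene classes bundled with AOD are already loaded. It has not been tested against AOD, so treat it as a starting point rather than a drop-in patch:

```php
<?php
// Assumes the Zend_Search_Lucene classes are already available (as they are inside AOD).
// TextNum_CaseInsensitive keeps digits and lowercases tokens itself,
// so the strtolower() calls on indexed values may become redundant.
Zend_Search_Lucene_Analysis_Analyzer::setDefault(
    new Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum_CaseInsensitive()
);
```

Since the default analyzer is also used when query text is parsed, the same analyzer probably needs to be in effect at search time, and records indexed before the change would need to be re-indexed before their numeric values become searchable.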

Hi Klou,

Thanks for the time you've taken to track this down. We'll make sure to add this fix to the next version of AOD.

Thanks,
Jim