Sometimes you need to import additional metadata for indexed documents. In this article I describe the import process using the Autonomy Lotus Notes Connector, but the procedure applies to other Autonomy connectors as well.
The easiest and most efficient way (though not very well documented) to import data from practically any external source is the ImportURL feature, which imports metadata from a web page. I used an ASP.NET web page to import data from a SQL database, but the same approach can connect to any backend system.
The exported metadata must be placed in meta tags in the head section of the page: the name attribute holds the name of the field, and the content attribute holds the exported data. When designing the import, I distinguished two types of fields:
- Single value: fields that require no further processing on the Notes Connector side
- Multi value: fields that are split on the Notes Connector side and indexed by IDOL Server as a multi-value field
The Notes Connector ImportURL operation expects output like the following:
<HTML>
<HEAD>
<TITLE></TITLE>
<META name=SingelValueField1 content=sampleContent1>
<META name=SingelValueField2 content=sampleContent2>
<META name=MultiValueField content=Value1,Value2,Value3>
</HEAD>
<BODY>
</BODY>
</HTML>
Let's assume the metadata import page is queried like this:
http://some_address/someweb_app/MetaDataImport.aspx?ParameterOne=some_value&ParameterTwo=some_Value
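To make the expected page format concrete, here is a minimal sketch of a generator for such a page. It is written in Python rather than ASP.NET purely for illustration. The function name `render_metadata_page` and its parameters are hypothetical, and quoting the attribute values is my assumption (the sample output above leaves them unquoted), so verify against your connector before relying on it.

```python
from html import escape


def render_metadata_page(single_values, multi_values, separator=","):
    """Build an HTML page whose <HEAD> carries one <META> tag per field.

    single_values: dict of field name -> string
    multi_values:  dict of field name -> list of strings, joined with
                   `separator` so the connector can split them later
    """
    metas = []
    for name, value in single_values.items():
        metas.append(f'<META name="{escape(name)}" content="{escape(value)}">')
    for name, values in multi_values.items():
        joined = separator.join(values)
        metas.append(f'<META name="{escape(name)}" content="{escape(joined)}">')
    head = "\n".join(metas)
    return f"<HTML>\n<HEAD>\n<TITLE></TITLE>\n{head}\n</HEAD>\n<BODY>\n</BODY>\n</HTML>"


page = render_metadata_page(
    {"SingelValueField1": "sampleContent1", "SingelValueField2": "sampleContent2"},
    {"MultiValueField": ["Value1", "Value2", "Value3"]},
)
print(page)
```

Note that a multi-value entry which itself contains the separator would be split incorrectly later, so pick a separator that cannot occur in your data.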
Now let's see how this is configured on the Notes Connector side.
As a first step, I recommend defining a field that holds the URL prefix, which is used later to form the full URL. This is useful if you ever need to change the address of your import page.
FixedFieldName0=tmp_import_url
FixedFieldValue0=http://some_address/someweb_app/MetaDataImport.aspx?ParameterOne=
Let's assume we import the following fields from the Lotus Notes document; they are used later to request the metadata (for example, a document ID in a SQL database):
DreField0=tmp_param1_field
NotesField0=param1_field
DreField1=tmp_param2_field
NotesField1=param2_field
Next, define fields that will hold the escaped parameters (so that the imported fields remain available in their original state for other purposes). It is a good idea to set a default value in case the Lotus Notes field is not present; this prevents the Notes Connector from crashing.
FixedFieldName1=tmp_param1_escape_field
FixedFieldValue1="null_value"
FixedFieldName2=tmp_param2_escape_field
FixedFieldValue2="null_value"
Then copy the parameters into these fields:
ImportFieldOp1=FieldGlue
ImportFieldOpApplyTo1=tmp_param1_escape_field
ImportFieldOpParam1=Fnameparam1_field
ImportFieldOp2=FieldGlue
ImportFieldOpApplyTo2=tmp_param2_escape_field
ImportFieldOpParam2=Fnameparam2_field
Then escape the input parameters and form the final URL:
ImportFieldOp3=Escape
ImportFieldOpApplyTo3=tmp_param1_escape_field
ImportFieldOp4=Escape
ImportFieldOpApplyTo4=tmp_param2_escape_field
ImportFieldOp5=FieldGlue
ImportFieldOpApplyTo5=tmp_import_url
ImportFieldOpParam5=Fnametmp_import_url, Fnametmp_param1_escape_field, &ParameterTwo=,Fnametmp_param2_escape_field
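To illustrate what the default value, Escape and FieldGlue steps achieve together, here is a rough Python sketch of the same URL composition. It assumes that Escape behaves like standard percent-encoding, which I have not verified against the connector; `build_import_url` is a hypothetical helper, not part of any Autonomy API.

```python
from urllib.parse import quote

# Same prefix as FixedFieldValue0 in the configuration above.
URL_PREFIX = "http://some_address/someweb_app/MetaDataImport.aspx?ParameterOne="
# Fallback used when the Notes field is missing (the FixedFieldValue1/2 idea).
DEFAULT_VALUE = "null_value"


def build_import_url(param1=None, param2=None):
    # Use the default when a parameter is absent.
    first = param1 if param1 is not None else DEFAULT_VALUE
    second = param2 if param2 is not None else DEFAULT_VALUE
    # Percent-encode each value (the Escape step), then concatenate
    # prefix, first value, literal separator and second value (the FieldGlue step).
    return URL_PREFIX + quote(first, safe="") + "&ParameterTwo=" + quote(second, safe="")


print(build_import_url("doc 42", "a&b"))
# http://some_address/someweb_app/MetaDataImport.aspx?ParameterOne=doc%2042&ParameterTwo=a%26b
```

Escaping matters because an unescaped `&` or space inside a parameter value would otherwise change the meaning of the query string.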
Now it is time to import the data from our web page:
ImportFieldOp6=ImportURL
ImportFieldOpApplyTo6=tmp_import_url
ImportFieldOpParam6=1;
After this operation, the output IDX file will contain the following fields:
...
#DREFIELD SingelValueField1="sampleContent1"
#DREFIELD SingelValueField2="sampleContent2"
#DREFIELD MultiValueField="Value1,Value2,Value3"
...
The last thing to do is to turn the raw MultiValueField into a real multi-value field:
ImportFieldOp7=Expand
ImportFieldOpApplyTo7=MultiValueField
ImportFieldOpParam7=,;multi_value_field
As output we get the following IDX file. Note that the IDX file still contains the original MultiValueField field. If you don't need it, you can filter it out at the IDOL Server level.
...
#DREFIELD SingelValueField1="sampleContent1"
#DREFIELD SingelValueField2="sampleContent2"
#DREFIELD MultiValueField="Value1,Value2,Value3"
#DREFIELD multi_value_field="Value1"
#DREFIELD multi_value_field="Value2"
#DREFIELD multi_value_field="Value3"
...
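The effect of the Expand step can be pictured with a short Python sketch. This is only an emulation of the behaviour visible in the IDX output above, not the connector's actual implementation; `expand_field` and `to_idx_lines` are hypothetical names.

```python
def expand_field(fields, source, separator, target):
    """Return a new list of (name, value) pairs with `source` split into `target` entries.

    The original `source` pair is kept, matching the IDX output shown above.
    """
    expanded = list(fields)
    for name, value in fields:
        if name == source:
            # One target entry per separated value.
            for part in value.split(separator):
                expanded.append((target, part))
    return expanded


def to_idx_lines(fields):
    # Render each pair in the '#DREFIELD name="value"' form used in IDX files.
    return [f'#DREFIELD {name}="{value}"' for name, value in fields]


fields = [
    ("SingelValueField1", "sampleContent1"),
    ("MultiValueField", "Value1,Value2,Value3"),
]
for line in to_idx_lines(expand_field(fields, "MultiValueField", ",", "multi_value_field")):
    print(line)
```

The arguments mirror ImportFieldOpParam7: the separator comes first, then the name of the field that receives the individual values.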
In the log file, you should see something similar to this:
...
12/01/2012 11:48:40 [0] IMPORTURL retrieving URL [http://some_address/someweb_app/MetaDataImport.aspx?ParameterOne=SomeValue&ParameterTwo=NextValue]
12/01/2012 11:48:40 [0] IMPORTURL returned 200 for http://some_address/someweb_app/MetaDataImport.aspx?ParameterOne=SomeValue&ParameterTwo=NextValue
12/01/2012 11:48:40 [0] IMPORTURL created temp file [c:\PathToNotesConnector\NotesFetch\Temp\IMPORTURL132636532092008988.tmp.AutnImportedURL.html] of 7481 bytes
...
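If you import many documents, it may help to scan the log for requests that did not return 200. The sketch below assumes the "IMPORTURL returned <status> for <url>" line format seen in the excerpt above; other connector versions may log differently, and the 404 line in the sample is made up for illustration.

```python
import re

# Matches lines such as: "... IMPORTURL returned 200 for http://..."
RETURNED = re.compile(r"IMPORTURL returned (\d{3}) for (\S+)")


def failed_imports(log_lines):
    """Collect (status, url) pairs for every ImportURL request that did not return 200."""
    failures = []
    for line in log_lines:
        match = RETURNED.search(line)
        if match and match.group(1) != "200":
            failures.append((int(match.group(1)), match.group(2)))
    return failures


sample = [
    "12/01/2012 11:48:40 [0] IMPORTURL returned 200 for http://some_address/someweb_app/MetaDataImport.aspx?ParameterOne=SomeValue&ParameterTwo=NextValue",
    # Hypothetical failure line, not taken from a real log.
    "12/01/2012 11:48:41 [0] IMPORTURL returned 404 for http://some_address/someweb_app/Missing.aspx",
]
print(failed_imports(sample))
```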
With ImportURL you can integrate with any data source and aggregate your data before sending it to IDOL Server. It has its limitations, but it is very simple and flexible.
I hope this saves you a couple of hours and/or a couple of additional grey hairs :)
Please leave a comment if you have any ideas for improvements and good luck!