2 September 2011

How to see a Date as a long number instead of random squares and other weird characters in Luke

Solution for Luke 3.3.0 (it works with other versions too).

Character set encoding is a real pain in the ass for a junior software developer.
Luckily, these days the UTF-8 character set (a multibyte character encoding for Unicode) makes life so much easier.
However, I still hear about lots of encoding problems in various places.

Anyway... back to the topic.

I am learning Lucene now from the book "Lucene in Action". It contains tons of information, but the examples suck. Really. (A review will come out soon.)

For some dumb reason I decided to add a field with a Date to my Document:
doc.add(new NumericField("timestamp").setLongValue(Calendar.getInstance().getTimeInMillis())); 
But when I tried to retrieve this field from the search results, I saw a null instead of a number (why? see my other post: http://pastorcmentarny.blogspot.com/2011/09/why-numericfield-displayed-null-instead.html ). I tried to sort out what was wrong (at that time I was also trying to understand how boosting a Document and a Field works and why getBoost() doesn't work correctly*), so I opened Luke and saw that my timestamp field was a bunch of squares and weird characters instead of a number.

Then I discovered that the decoder for the timestamp field was set to string utf-8, while my data type is long...
So what you need to do is set the decoder to numeric-long and press the Set button, and then ... everything works :).

REMEMBER: Lucene stores everything as a STRING, so if you want to see a long, you need to change the decoder for your field in Luke.
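
To get a feeling for why the wrong decoder shows garbage, here is a minimal sketch in plain Java. It is NOT how Lucene really encodes a NumericField (Lucene uses its own prefix coding); it only assumes a simplified case where a long is written as 8 raw bytes, and the class and method names (DecoderDemo, encodeLong, decodeAsLong, decodeAsUtf8) are made up for this illustration. The same bytes give a sensible number with one decoder and control characters with the other.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Lucene or Luke.
public class DecoderDemo {

    // Simplified assumption: store the long as 8 big-endian bytes.
    // Lucene's real NumericField encoding is different (prefix coded).
    static byte[] encodeLong(long value) {
        return ByteBuffer.allocate(8).putLong(value).array();
    }

    // "numeric-long" style decoding: read the 8 bytes back as a long.
    static long decodeAsLong(byte[] raw) {
        return ByteBuffer.wrap(raw).getLong();
    }

    // "string utf-8" style decoding: treat the same bytes as text.
    static String decodeAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        long timestamp = 1314921600000L; // an example timestamp in milliseconds
        byte[] raw = encodeLong(timestamp);

        System.out.println("as long:  " + decodeAsLong(raw));
        // Mostly unprintable control characters, which a GUI draws as squares.
        System.out.println("as utf-8: " + decodeAsUtf8(raw));
    }
}
```

The data is fine in both cases; only the decoder you pick decides whether you see a number or squares, which is exactly what the Decoder setting in Luke controls.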

(* About Boost: if you touch this black magic, the results are ... unexpected (they are correct, but they are not what you would expect to see).)