<!--
    Licensed to the Apache Software Foundation (ASF) under one or more
    contributor license agreements.  See the NOTICE file distributed with
    this work for additional information regarding copyright ownership.
    The ASF licenses this file to You under the Apache License, Version 2.0
    (the "License"); you may not use this file except in compliance with
    the License.  You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
 -->

# JRE Version Migration Guide

If possible, use the same JRE major version at both index and search time.
When upgrading to a different JRE major version, consider re-indexing.

Different Java versions may implement different versions of Unicode,
which can change the way some parts of Lucene treat your text.

An (outdated) example: with Java 1.4, `LetterTokenizer` will split around the
character U+02C6, but with Java 5 it will not. This is because `LetterTokenizer`
relies on `Character.isLetter`, and Java 1.4 implements Unicode 3.0, whereas
Java 5 implements Unicode 4.0.

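To see how the running JRE classifies this character, a minimal sketch can query
`Character.isLetter` directly. The class name `LetterCheck` and the helper
`describe` are hypothetical names chosen for illustration; only
`java.lang.Character` calls are used, so no Lucene dependency is assumed.

```java
import java.util.Locale;

// Hypothetical illustration; not part of Lucene.
final class LetterCheck {

  /** Describes how the running JRE classifies the given code point. */
  static String describe(int codePoint) {
    return String.format(
        Locale.ROOT,
        "U+%04X letter=%b type=%d",
        codePoint, Character.isLetter(codePoint), Character.getType(codePoint));
  }

  public static void main(String[] args) {
    // U+02C6 is the character from the example above.
    System.out.println(describe(0x02C6));
  }
}
```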
The version of Unicode supported by Java is listed in the documentation
of the `java.lang.Character` class. For reference, Java versions from Java 11
onward support the following Unicode versions:

 * Java 11, Unicode 10.0
 * Java 12, Unicode 11.0
 * Java 13, Unicode 12.1
 * Java 15, Unicode 13.0
 * Java 16, Unicode 13.0
 * Java 17, Unicode 13.0

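As a rough aid, the list above can be encoded as a lookup keyed by the Java
feature version reported by `Runtime.version().feature()` (available since
Java 10). This is only a sketch: the class `UnicodeVersions` and the method
`forJavaFeature` are hypothetical names, the map contains just the entries
listed above, and any other Java version yields an empty result rather than
a guess.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper; it only encodes the list from this guide.
final class UnicodeVersions {

  private static final Map<Integer, String> BY_JAVA_FEATURE =
      Map.of(
          11, "10.0",
          12, "11.0",
          13, "12.1",
          15, "13.0",
          16, "13.0",
          17, "13.0");

  /** Returns the Unicode version listed for the given Java feature version, if any. */
  static Optional<String> forJavaFeature(int feature) {
    return Optional.ofNullable(BY_JAVA_FEATURE.get(feature));
  }

  public static void main(String[] args) {
    int feature = Runtime.version().feature();
    String unicode =
        forJavaFeature(feature).orElse("not listed; see the java.lang.Character documentation");
    System.out.println("Java " + feature + ": Unicode " + unicode);
  }
}
```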
In general, whether you need to re-index depends largely on the data that
you are searching and on what changed between the Unicode versions involved.
For example, if you are completely sure your content is limited to the
"Basic Latin" range of Unicode, you can safely ignore this.

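One way to test the "Basic Latin" assumption is to scan content for code points
outside that block. The sketch below uses `Character.UnicodeBlock.of`; the class
`BasicLatinCheck` and the method `isBasicLatin` are hypothetical names, and how
the text to scan is obtained from your own data is left open.

```java
// Hypothetical illustration; not part of Lucene.
final class BasicLatinCheck {

  /** Returns true if every code point of the text lies in the Basic Latin block. */
  static boolean isBasicLatin(CharSequence text) {
    return text.codePoints()
        .allMatch(cp -> Character.UnicodeBlock.of(cp) == Character.UnicodeBlock.BASIC_LATIN);
  }

  public static void main(String[] args) {
    System.out.println(isBasicLatin("plain ASCII text"));
    System.out.println(isBasicLatin("caf\u00e9"));
  }
}
```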