ABOUT THE SYSTEM
The citation extraction system is designed for information search in large structured texts such as codes and laws. The result of a search is a set of citations relevant to the user's query. Here a citation is a verbatim, semantically complete extract from the text. A citation is usually a fraction of a percent of the length of the original text, which substantially reduces the effort needed to analyze the retrieved information and make a decision.
As in other systems, search is performed by keywords. It is possible to search either across a document collection or within a single document. During the search, the total number of citations found is counted, and the number of citations in each document is displayed.
A schematic map is built for each document, and the list of citations is displayed against this map. Text fragments that are not part of any citation are hidden and represented by graphic elements. Any hidden fragment can be opened for reading and collapsed again when it is no longer needed.
To help refine a query quickly, the system builds a prompting cloud, which contains a set of words semantically related to the query.
Comment: At the current stage of the project, a demo prototype has been developed. The prototype is not a finished system; it only demonstrates the proposed approaches to search. A collection of legal documents, including individual codes and laws of the Russian Federation, is currently available.
FORMING A QUERY
A query is entered in the query line and may be a set of keywords, a phrase, or a sentence. The query is processed when "Enter" is pressed or "Search" is clicked. Regardless of the form in which a keyword is entered, the system takes all of its possible word forms into account.
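The idea of matching a keyword in any of its word forms can be illustrated with a minimal sketch. The `LEMMAS` table and the functions `lemma` and `matches` are hypothetical and are not taken from the prototype; a real system would presumably rely on a morphological analyzer. The sketch also assumes that every keyword must occur in a sentence, which the description above does not state.

```python
# Hypothetical table mapping each word form to its base form.
# This is an illustrative assumption, not the prototype's dictionary.
LEMMAS = {
    "law": "law", "laws": "law",
    "citizen": "citizen", "citizens": "citizen",
    "own": "own", "owns": "own", "owned": "own",
}

def lemma(word: str) -> str:
    """Return the base form of a word, or the lowercased word if unknown."""
    w = word.lower()
    return LEMMAS.get(w, w)

def matches(query: str, sentence: str) -> bool:
    """True if every query keyword occurs in the sentence in any of its forms."""
    words = {lemma(w.strip(".,;:!?")) for w in sentence.split()}
    return all(lemma(k) in words for k in query.split())
```

With this table, the query "law citizen" matches the sentence "Citizens are equal before the laws." even though neither keyword appears in exactly the form that was typed.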
SEARCH IN DOCUMENT COLLECTION
To search in the document collection, choose the documents to be searched from the hierarchical list on the "Collection" tab. All documents are selected by default.
The search result, shown on the "Search result" tab, is a list of documents with the total number of citations found and the number of citations in each document (in brackets). If there are too many citations, it is reasonable to refine the query using the prompting cloud or the search line. To view the citations, click on a document link; the "Document" tab is then activated. Citations are displayed against the schematic map of the document. Within the document, it is possible to search further using the search line or the prompting cloud.
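The result list described above (a total count plus a per-document count in brackets) could be produced roughly as follows. This is only a sketch: the function name `summarize`, the input format, and the ordering by citation count are assumptions made for illustration.

```python
from collections import Counter

def summarize(citations):
    """Build the result list from (document_title, citation_text) pairs.

    The pair format and the ordering by count are illustrative assumptions.
    """
    per_document = Counter(title for title, _ in citations)
    total = sum(per_document.values())
    lines = [f"Citations found: {total}"]
    for title, count in per_document.most_common():
        lines.append(f"{title} ({count})")
    return lines
```

Each returned line after the first corresponds to one document link on the "Search result" tab.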
To perform a new search over the collection, go back to the "Collection" tab.
SEARCH AND NAVIGATION IN THE DOCUMENT
A document can be chosen in advance from the hierarchical list under the search line; the user is then taken to the "Document" tab, and the search is restricted to the chosen document. A document can also be chosen from the results of a search over the collection.
The document is displayed as a schematic map on which all text fragments except citations are hidden. If there are many citations, only a limited number of the most relevant ones are displayed on the first page. The number of displayed citations can be set with the dropdown list to the right of the search line. Clicking the "right" or "left" arrow moves to less or more relevant citations. Keywords are highlighted in the text. Numbers (indices) of hidden fragments are displayed in gray. Hidden sentences are represented by square brackets […], and hidden paragraphs are represented by braces {…}. Collapsed elements (articles, sentences, and paragraphs) that contain relevant text fragments are colored orange. Any hidden fragment can be restored with a mouse click. A fragment (together with all of its nested elements, if any) can be hidden by clicking on its number or marker.
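The bracket and brace markers can be illustrated with a small sketch. It assumes a document is given as a list of paragraphs, each a list of sentences, and that relevant sentences are identified by (paragraph, sentence) index pairs; the function name `render_map` and this data layout are hypothetical and not taken from the prototype.

```python
def render_map(paragraphs, relevant):
    """Render a text-only schematic map.

    paragraphs: list of lists of sentences (assumed layout).
    relevant:   set of (paragraph_index, sentence_index) pairs.
    """
    lines = []
    for p, sentences in enumerate(paragraphs):
        has_citation = any((p, s) in relevant for s in range(len(sentences)))
        if not has_citation:
            # The whole paragraph is hidden behind a brace marker.
            lines.append(f"{p + 1} {{…}}")
            continue
        # Inside a paragraph with citations, hide the other sentences.
        parts = [
            sentence if (p, s) in relevant else "[…]"
            for s, sentence in enumerate(sentences)
        ]
        lines.append(f"{p + 1} " + " ".join(parts))
    return lines
```

The leading number on each line plays the role of the fragment index; coloring and click handling are left out of the sketch.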
It is also possible to go through the relevant citations in order from the top to the bottom of the document using the green arrows to the right of the query line. The citations are shown on the page in groups. The number of citations in a group is selected from the dropdown list located between the green arrows. Using the right and left arrows, the user can read the relevant citations like turning the pages of a book. With a group size of 1 to 3 citations, there is no need to scroll the web page up and down at all.
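Paging through citations in groups amounts to slicing an ordered list. The following sketch assumes the citations are already sorted in document order; the names `page` and `page_count` are hypothetical.

```python
def page(citations, group_size, page_index):
    """Return the group of citations shown on the given zero-based page."""
    start = page_index * group_size
    return citations[start:start + group_size]

def page_count(citations, group_size):
    """Number of pages needed to show all citations (ceiling division)."""
    return -(-len(citations) // group_size)
```

Clicking the right or left arrow then corresponds to increasing or decreasing `page_index` by one within the range given by `page_count`.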
The search line in combination with the prompting cloud is the most suitable tool for searching in a large document.
PROMPTING CLOUD (for Russian texts only)
To help refine a query quickly, the system builds a prompting cloud, which contains a set of words semantically related to the query. The prompting cloud is a graphical widget placed between the search line and the search results. Frequently occurring words are emphasized with a larger font. Keywords from the query are placed in the center of the cloud.
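One simple way to emphasize frequent words with a larger font is linear scaling between a minimum and a maximum size. The sketch below is an assumption about how such scaling might work, not the prototype's actual method; the function name `font_sizes` and the default point sizes are hypothetical.

```python
def font_sizes(frequencies, min_pt=10, max_pt=30):
    """Map word frequencies to font sizes by linear interpolation.

    frequencies: dict of word -> occurrence count (assumed input format).
    """
    if not frequencies:
        return {}
    low = min(frequencies.values())
    high = max(frequencies.values())
    if high == low:
        # All words are equally frequent, so they share the same size.
        return {word: min_pt for word in frequencies}
    scale = (max_pt - min_pt) / (high - low)
    return {
        word: round(min_pt + (count - low) * scale)
        for word, count in frequencies.items()
    }
```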
To refine a query, click on one of the words in the cloud. The word is added to the query if it is not already there; otherwise it is removed from the query. The system then automatically performs a new search, displays the results, and builds a new prompting cloud.
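The add-or-remove behavior of a click on a cloud word can be sketched as a toggle on the query string. The function name `toggle_word` is hypothetical, and the sketch compares words exactly, ignoring word forms.

```python
def toggle_word(query: str, word: str) -> str:
    """Add the word to the query if absent, otherwise remove it."""
    words = query.split()
    if word in words:
        words = [w for w in words if w != word]
    else:
        words.append(word)
    return " ".join(words)
```

After each toggle, the resulting query would be submitted as a new search, which in turn yields a new cloud.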
Demo prototype - IE.