PolyAnalyst is a convenient integrated environment for database exploration with a friendly object-oriented user interface. It incorporates a set of powerful tools: exploration engines for intelligent data analysis.
The main engine, Core PolyAnalyst, finds the exact form of multi-parameter functional dependencies in data and expresses them as mathematical formulae and/or structural algorithms that include IF and FOR blocks, as well as other constructions. A unique feature of PolyAnalyst is its ability to discover empirical laws of a great variety of forms. In particular, it can work with structured data, which are not necessarily represented as just sets of attribute values.
The second engine, ARNAVAC, uses statistical methods to detect the presence of dependencies in data and displays the results of exploration in tabular form. ARNAVAC also separates the sub-population of points obeying the found dependence from a diffuse component considered to be noise or database errors.
The other engines use multiple regression and data visualization as discovery methods.
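The idea of separating points that obey a dependence from a diffuse component, as described for ARNAVAC, can be illustrated with a small sketch. This is not ARNAVAC's algorithm, which is not specified here. It assumes a simple linear dependence, substitutes a Theil-Sen style robust line fit (median of pairwise slopes), and uses a hand-picked residual tolerance. The function names `fit_line_robust` and `split_by_dependence` are hypothetical.

```python
from statistics import median


def fit_line_robust(xs, ys):
    """Theil-Sen style fit: median pairwise slope, then median intercept.

    Illustrative stand-in only; it is not the method used by ARNAVAC.
    """
    slopes = [
        (ys[j] - ys[i]) / (xs[j] - xs[i])
        for i in range(len(xs))
        for j in range(i + 1, len(xs))
        if xs[j] != xs[i]
    ]
    slope = median(slopes)
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept


def split_by_dependence(xs, ys, tolerance):
    """Return the fitted law plus indices of obeying and diffuse points."""
    slope, intercept = fit_line_robust(xs, ys)
    obeying, diffuse = [], []
    for idx, (x, y) in enumerate(zip(xs, ys)):
        residual = abs(y - (slope * x + intercept))
        # Points far from the fitted law are treated as possible noise.
        (obeying if residual <= tolerance else diffuse).append(idx)
    return (slope, intercept), obeying, diffuse


# Toy data: y = 2x + 1 everywhere except the record at index 4.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [1, 3, 5, 7, 30, 11, 13, 15]
law, obeying, diffuse = split_by_dependence(xs, ys, tolerance=0.5)
print(law, obeying, diffuse)
```

A robust fit is used here so that the outlying record does not distort the estimated law before the split is made; an ordinary least-squares fit would be pulled toward the outlier.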
Summary of PolyAnalyst features and characteristics:
- PolyAnalyst can work with databases consisting of up to 16000 records and up to 1000 attribute fields.
- It can export and import files in DBF and CSV (comma-separated values) formats.
- Data from different sources can be combined using a mechanism of keys and references.
- All contents of a specific database exploration task, including data, graphs, results obtained by the exploration engines, rules, and laws, are stored in a separate project file. This allows one to use a single copy of PolyAnalyst in many different research projects. Upon loading a project file, you can continue your work on the project exactly at the step where you left it the previous time.
- Analyzed data may be a mixture of numerical, boolean or categorical values. PolyAnalyst can work with partially missing data.
- Data can be easily split into several subsets, or datasets, which can be explored separately. New datasets can be created by splitting data according to various methods and criteria, or they can result from boolean operations (intersection, union, complement) on existing datasets.
- All elements of a project (datasets, rules, tables, currently running exploration engines, their reports, and so on) are represented as objects depicted by icons. This object-oriented user interface makes PolyAnalyst easy to learn and operate.
- Three main exploration engines of PolyAnalyst automatically extract information, or knowledge, from the data. The first engine, Core PolyAnalyst, discovers the exact form of dependencies in data and expresses them in the language of mathematical expressions, structural blocks, and other constructions, which are intuitive and easy to understand. Core PolyAnalyst can discover laws of a very broad nature. Discovered rules can be edited; they can also be combined with rules entered by the user or rules obtained by other PolyAnalyst exploration engines. Combining the rules discovered by PolyAnalyst with the user's prior knowledge of the field generally yields elaborate and very effective models. The second exploration engine, ARNAVAC, detects the presence of functional dependencies in data and finds the data fields obeying these dependencies. It also determines the accuracy and significance of the dependence found, as well as the subset of data points that do not obey the dependence (possible noise or database errors). ARNAVAC represents the discovered dependence in tabular form, revealing its general structure. The third exploration engine provides multiple regression with automated selection of independent variables. It is a more traditional but still very useful tool.
- One additional data exploration method in PolyAnalyst is based on its data visualization capabilities. The user can create various graphs, manipulate points and datasets with the help of these graphs, modify data in the graphs using entered and discovered rules, or graphically analyze multi-dimensional models by varying their parameters with sliders. This manual data analysis is particularly helpful in difficult cases, for example when complex derived parameters of the original data structures have to be used as independent variables so that Core PolyAnalyst can obtain a more precise empirical law.
- Exploration engines can work concurrently.
- PolyAnalyst maintains strict control of the significance of reported results.
- PolyAnalyst performs data exploration automatically, while keeping interaction with the user to a minimum. This feature allows even an inexperienced user with no mathematical or statistical training to exercise the full power of PolyAnalyst exploration engines.
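The dataset operations listed in the summary (splitting by a criterion, then intersection, union, and complement) can be sketched with ordinary sets of record identifiers. This is only an illustration of the concept, not PolyAnalyst's internal representation; the record fields, the `make_dataset` helper, and the sample values are all assumptions made for the example.

```python
# Hypothetical records; one "age" value is missing to mimic partially missing data.
records = [
    {"id": 1, "age": 34, "smoker": True},
    {"id": 2, "age": 52, "smoker": False},
    {"id": 3, "age": 47, "smoker": True},
    {"id": 4, "age": None, "smoker": True},
    {"id": 5, "age": 61, "smoker": False},
]


def make_dataset(records, predicate):
    """Split by a criterion: keep the ids of records that satisfy it."""
    return frozenset(r["id"] for r in records if predicate(r))


all_ids = frozenset(r["id"] for r in records)

# Records with a missing age simply do not satisfy the age criterion.
over_40 = make_dataset(records, lambda r: r["age"] is not None and r["age"] > 40)
smokers = make_dataset(records, lambda r: r["smoker"])

both = over_40 & smokers          # intersection
either = over_40 | smokers        # union
non_smokers = all_ids - smokers   # complement relative to all records

print(sorted(both), sorted(either), sorted(non_smokers))
```

Representing a dataset as a set of identifiers rather than a copy of the rows keeps derived datasets cheap and makes the boolean operations map directly onto set operators.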