Data.olllo User Guide
Data.olllo Data Assistant aims to be the best data software in the world. It currently ships with three cores (P Core, V Core, X Core) and supports a wide range of data file types.
1. Basic Software Information:
- Speed: Fast, very fast, extremely fast.
- Core:
  - P Core: The classic core, optimized for datasets up to tens of millions.
  - V Core: Tailored for datasets in the billions, excelling with large HDFS files.
  - X Core: Enhanced classic core, designed for terabyte-scale data, supporting GPU acceleration and multi-threading.
- Supported Data File Types:
  - P Core and X Core
    - Open files supported: csv, xlsx, xls, dbf, json, html, xml, clipboard, h5, hdf5, hdf, feather, parquet, dta, sav, pkl, sas7bdat, xpt, sas, spss, table, gbq, fwf, orc.
    - Save files supported: csv, xls, xlsx, pkl, clipboard, json, html, xml, latex, h5, hdf5, hdf, feather, parquet, orc, dta.
  - X Core
    - Open files supported: csv, arrow, hdf5, parquet, feather, fits, ascii, json, xlsx, xls, dbf.
    - Save files supported: arrow, hdf5, parquet, feather, fits, csv, xls, xlsx, json.
- File Types Between Cores: In theory, the P, V, and X Cores can open each other's files. You can open a file with one core, then save it as CSV or a similar common format for another core to use.
- Data File Size: Large, very large, and extremely large files are supported. The P Core of Data.olllo Data Assistant can handle tens of millions to hundreds of millions of rows of data. If a file is too large for the P Core to open, you can switch to the V Core or X Core, which support extremely large data files. As long as you have enough disk space, you can handle files of any size.
- Supported File Encodings: What encoding do you need? :) Manual input of any encoding is supported, and there is also an automatic encoding detection feature.
- Toolbox Online Update: Supports online updates for the toolbox and super options in the Super Features section.
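As a rough illustration of the encoding options above, the following Python sketch tries a list of candidate encodings in order, with an optional manually supplied encoding tried first. It is not Data.olllo's actual detection logic; the function name `read_text_with_fallback`, the candidate list, and the sample data are assumptions made for this example.

```python
import tempfile
from pathlib import Path

def read_text_with_fallback(path, candidates=("utf-8", "gb18030"), manual=None):
    """Return (text, encoding) using the first encoding that decodes cleanly.

    Hypothetical helper: `manual` mimics typing an encoding by hand,
    `candidates` mimics a simple automatic detection pass.
    """
    raw = Path(path).read_bytes()
    order = ([manual] if manual else []) + list(candidates)
    for enc in order:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue  # this encoding does not fit, try the next one
    raise ValueError(f"none of {order} could decode {path}")

# Demo: write a GB18030-encoded CSV, then read it back without naming the encoding.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.csv"
    sample.write_bytes("name,city\n张三,北京\n".encode("gb18030"))
    text, used = read_text_with_fallback(sample)

print(used)
print(text)
```

Trying encodings in a fixed order is the simplest possible strategy. It can pick a wrong encoding that happens to decode without error, which is why a manual override remains useful.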
2. Basic Software Features:
- Open Various File Types: Supports batch opening of multiple files of the same type or with the same encoding.
- Concatenate Files While Opening: The V Core supports concatenating multiple files into one upon opening (combining files with the same fields). The P and X Cores also allow you to merge multiple files into one after opening them.
- Save as Various File Types: See the supported file types for more details.
- Split and Save Files: You can split and save files into multiple smaller files either by size or by keywords.
- File Splitting: Since the P Core cannot handle files as large as the V Core can, it supports splitting files without opening them.
- Display Rows and Columns: You can choose to display a specific number of rows and columns for easier viewing.
- Close Files: Allows closing files without saving, and supports batch closing of multiple files.
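The size-based and keyword-based splitting described above can be sketched in plain Python. This is a minimal illustration rather than Data.olllo's own implementation; the helper names `split_by_rows` and `split_by_key`, the output file naming, and the sample data are all assumptions. Both helpers stream the input row by row, so the whole file never has to be loaded into memory.

```python
import csv
import tempfile
from pathlib import Path

def split_by_rows(src, out_dir, rows_per_file):
    """Split a CSV into parts of at most `rows_per_file` data rows, repeating the header."""
    out_dir = Path(out_dir)
    parts = []
    with open(src, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        out, writer, count = None, None, 0
        for row in reader:
            if out is None or count == rows_per_file:
                if out is not None:
                    out.close()
                part = out_dir / f"part_{len(parts) + 1}.csv"
                parts.append(part)
                out = open(part, "w", newline="", encoding="utf-8")
                writer = csv.writer(out)
                writer.writerow(header)
                count = 0
            writer.writerow(row)
            count += 1
        if out is not None:
            out.close()
    return parts

def split_by_key(src, out_dir, column):
    """Write one CSV per distinct value found in `column`; return the sorted values."""
    out_dir = Path(out_dir)
    handles, writers = {}, {}
    with open(src, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        for row in reader:
            key = row[column]
            if key not in writers:
                target = out_dir / f"{column}_{key}.csv"
                handles[key] = open(target, "w", newline="", encoding="utf-8")
                writers[key] = csv.DictWriter(handles[key], fieldnames=reader.fieldnames)
                writers[key].writeheader()
            writers[key].writerow(row)
    for handle in handles.values():
        handle.close()
    return sorted(handles)

# Demo with a tiny 5-row file.
tmp = Path(tempfile.mkdtemp())
src = tmp / "orders.csv"
src.write_text("id,city\n1,Paris\n2,Rome\n3,Paris\n4,Oslo\n5,Rome\n", encoding="utf-8")
parts = split_by_rows(src, tmp, rows_per_file=2)
keys = split_by_key(src, tmp, "city")
print([p.name for p in parts])
print(keys)
```

The keyword version uses the raw cell value in the file name, which is fine for this demo but would need sanitizing for values that contain path separators or other special characters.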
3. Software Data Processing Features:
- Flexible Selection Set: Select All, Deselect, Row Selection, Column Selection.
- Basic Information: Displays basic information about the dataset.
- Quick Statistics: Quickly calculates the maximum, minimum, average, etc., of the dataset.
- Quick Sorting: Supports sorting by one or multiple columns, in the order of the columns. If you need to change the order, use the "Swap Two Columns" function first.
- Swap Two Columns Function: Select two columns to swap their order.
- Delete Rows, Delete Columns: Delete the selected rows or columns.
- Modify Column Name, Modify Column Type (String, Integer, Decimal): Allows renaming columns or changing their data type.
- Edit a Row or Insert a Row: Edit or insert a row.
- Edit a Column or Add a New Column: Edit a column or add a new one (supports operations like +, -, *, / for numerical types, and + for string types—try it to see how it works).
- Filter Records (Rows): Supports filtering rules for data or string types, such as "Equal to," "Not Equal to," "Greater than," "Less than," "Greater than or equal to," and "Less than or equal to."
- Concatenate Files: Vertically concatenate multiple files.
- Merge Files: Horizontally merge two files by keyword. For the P Core, the keyword columns must have the same name in both files (if they differ, use the "Modify Column Name" function to make them match). For the V Core, the keyword names do not need to match, but the two tables must not share any field names (use "Modify Column Name" to rename the conflicting columns first).
- Data Pivot: Similar to a pivot table, select the columns for rows and columns, specify the content and type (total, average, etc., can be shown together), and pivot. You can also select columns without selecting rows and still perform the pivot.
- Remove Duplicates: Identifies and removes duplicate data. You can choose to keep or discard duplicates, and decide to keep the first or last occurrence.
- Find Duplicates: Identifies duplicate data entries.
- Group Statistics: Provides total, average, maximum, and minimum statistics grouped by keyword.
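For readers who prefer code, several of the operations above have close counterparts in pandas. The sketch below assumes the data table behaves like a pandas DataFrame (an assumption based on the `df` variable used in command mode, not something this guide states explicitly). The sample tables and column names are invented for illustration.

```python
import pandas as pd

orders = pd.DataFrame({
    "customer": ["ann", "bob", "ann", "cid", "bob", "ann"],
    "region":   ["north", "south", "north", "south", "south", "north"],
    "amount":   [120, 80, 120, 50, 30, 70],
})
customers = pd.DataFrame({
    "customer": ["ann", "bob", "cid"],
    "tier":     ["gold", "silver", "bronze"],
})

# Quick Sorting: sort by several columns, in the order they are listed.
sorted_df = orders.sort_values(["region", "amount"], ascending=[True, False])

# Find Duplicates / Remove Duplicates: flag repeated rows, then keep the first occurrence.
dup_mask = orders.duplicated(keep="first")
deduped = orders.drop_duplicates(keep="first")

# Group Statistics: total, average, maximum, and minimum per keyword.
stats = orders.groupby("customer")["amount"].agg(["sum", "mean", "max", "min"])

# Data Pivot: rows by customer, columns by region, total and average shown together.
pivot = orders.pivot_table(index="customer", columns="region",
                           values="amount", aggfunc=["sum", "mean"])

# Merge Files: horizontal merge on a keyword column that has the same name in both tables.
merged = orders.merge(customers, on="customer", how="left")

# Concatenate Files: vertical concatenation of tables with the same fields.
stacked = pd.concat([orders, orders], ignore_index=True)

print(stats)
print(merged.head())
```

Note that the merge here relies on both tables naming the keyword column `customer`, which mirrors the P Core requirement described in the Merge Files item.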
4. Software Super Data Processing Features
The super features most worth mentioning are Super Filter, Super Extraction, and Super Replace. The term "Super" means just that: super! These features support not only plain strings but also rule descriptions (regular expressions). For regular expressions, we provide a graphical tool that visualizes the expression, making it easier for those who want to use regular expressions without learning them.
With these super features, you can handle tasks like searching, filtering, replacing, matching, selecting content, and more!
- Super Filter: Allows you to filter selected columns, with the filtered result stored in a new field in the same data table. For example, if you want to check for phone numbers in a column, select the phone number regular expression, confirm it, and the result will be saved. If the result is True, it means there is a phone number; if False, there is not. You can then extract the filtered rows by using the "Filter Records" feature to select True or False.
- Super Extraction: Allows you to extract content from selected columns, with the extracted result stored in a new field in the same data table. For example, if you want to extract phone numbers from a column, select the phone number regular expression, confirm it, and the result will be extracted into a new column (you can name the column).
In Super Extraction, you can also use content as a delimiter. For instance, if you want to split a column using numbers (where the numbers are different), select the delimiter content within Super Extraction, choose the number regular expression, confirm it, and you'll get a new column with the split content.
- Super Replace: Allows you to replace matched content with your desired text. For example, if you want to replace a phone number with "139xxxxxx", select the column, choose the phone number regular expression, enter "139xxxxxx" as the replacement, confirm it, and the result will be stored in a new column.
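The three Super features map naturally onto standard regular-expression operations. The sketch below uses Python's built-in `re` module to mimic them on a plain list of strings; it illustrates the idea only, not how Data.olllo implements it. The phone-number pattern (an 11-digit mobile number starting with 1), the helper `first_match`, and the sample values are assumptions made for this example.

```python
import re

# Assumed pattern: an 11-digit number starting with 1, not embedded in a longer digit run.
PHONE = re.compile(r"(?<!\d)1\d{10}(?!\d)")
NUMBER = re.compile(r"\d+")

column = [
    "Call 13912345678 after 5pm",
    "no contact details",
    "backup: 13800001111",
]

def first_match(pattern, text):
    """Return the first match of `pattern` in `text`, or an empty string."""
    match = pattern.search(text)
    return match.group(0) if match else ""

# Super Filter: a True/False result per row, stored as a new column.
has_phone = [bool(PHONE.search(cell)) for cell in column]

# Super Extraction: the matched content per row, stored as a new column.
extracted = [first_match(PHONE, cell) for cell in column]

# Super Extraction with content as a delimiter: split wherever a run of digits occurs.
split_on_numbers = NUMBER.split("apples12pears345plums")

# Super Replace: substitute every match with the replacement text.
replaced = [PHONE.sub("139xxxxxx", cell) for cell in column]

print(has_phone)
print(extracted)
print(split_on_numbers)
print(replaced)
```

Keeping each result in a separate list mirrors the way the Super features write their output to a new column rather than overwriting the source column.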
The Super Features also include a graphical regular expression generator with examples to help you understand and try out the tool. Regex.olllo, the regular expression visualizer, is also part of the Olllo series and a fun application to try.
5. Software Command Mode
The software is developed in Python and supports Python statements. The data table is represented as df. If you know programming, you can write and execute statements on it directly. For example, df.sample(5) returns 5 randomly sampled rows. By looking at the examples provided, you'll quickly understand how to use it.
Because command mode is essentially programming, anything you can express as a Python statement on df can be done there.
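A few statements of the kind you might type in command mode are sketched below. The example builds its own small `df` so that it can run outside the software, and it assumes `df` is a pandas DataFrame, which is consistent with the `df.sample(5)` call but is not stated explicitly in this guide. The column names and values are invented.

```python
import pandas as pd

# Stand-in for the table that command mode exposes as `df`.
df = pd.DataFrame({
    "name":  ["a", "b", "c", "d", "e", "f", "g", "h"],
    "score": [71, 88, 93, 64, 88, 55, 79, 90],
})

sample = df.sample(5, random_state=0)   # 5 random rows; random_state makes the draw repeatable
summary = df["score"].describe()        # count, mean, std, min, quartiles, max
high = df[df["score"] >= 88]            # keep only rows that satisfy a condition
df["passed"] = df["score"] >= 60        # add a new column derived from an existing one
top3 = df.sort_values("score", ascending=False).head(3)

print(sample)
print(summary)
print(high)
print(top3)
```

Inside the software, the first block that creates `df` would be unnecessary, since the currently open table already plays that role.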
6. Tools Map and Regular Expression Map
The tools map and the regular expression map are updated periodically, and each update must be applied manually. More features are added with each update.
7. Overall Function Layout
8. UI Themes
Data.olllo offers two distinct user interface themes to suit your preference and enhance your experience. The Classic UI offers a simple, familiar layout for straightforward data management. For a more modern and customizable experience, the Advanced UI provides both Light and Dark themes, with scaling options to adapt to your display needs. Whether you prefer a classic look or a sleek, scalable design, Data.olllo gives you the flexibility to choose your ideal interface.
- Advanced UI (Dark)
- Advanced UI (Light)
- Classic UI
Version Details
Version Available: Above 7.0