Table#
- class swordfish._swordfishcpp.Table#
Represents a tabular data structure.
In a table, data is logically organized in rows and columns. Each row represents a unique record, and each column represents a field in that record. The class provides functionality for data manipulation and analysis.
- classmethod from_pandas(data, *, types=None)#
Creates a Table instance from a Pandas DataFrame.
- Parameters:
data (pd.DataFrame) – The Pandas DataFrame to convert.
types (Dict[str, DataType], optional) – Column type mappings where keys are column names and values are DataType enumerations. If None, types are inferred automatically.
- Returns:
A new Table instance containing the DataFrame data.
- Return type:
Table
- to_pandas()#
Converts this Table to a Pandas DataFrame.
- Returns:
A DataFrame containing the same data, with column types mapped automatically to compatible Pandas dtypes.
- Return type:
pd.DataFrame
- property types: Dict[str, DataType]#
Returns the data types of all table columns.
- Returns:
Mapping of column names to their corresponding DataType values.
- Return type:
Dict[str, DataType]
- property name: str#
Returns the table’s name.
- Returns:
The assigned name of this table.
- Return type:
str
Indicates whether this table is shared across sessions.
- Returns:
True if the table is shared, False if it is private to the current session.
- Return type:
bool
Makes this table accessible across sessions with the specified name.
- Parameters:
name (str) – Global name for the shared table.
readonly (bool, optional) – Whether to restrict the table to read-only access. Defaults to False.
- Returns:
This table instance for method chaining.
- Return type:
Self
- schema()#
Returns the table’s schema information.
- Returns:
Column names mapped to their respective data types.
- Return type:
- head(n=)#
Returns the first n rows of the table.
- tail(n=)#
Returns the last n rows of the table.
- count()#
Counts the number of rows in the table.
- Returns:
The number of rows in the table.
- Return type:
- summary(interpolation=, characteristic=, percentile=, precision=, partitionSampling=)#
Computes comprehensive summary statistics for numeric columns.
- Parameters:
interpolation (Constant, optional) – Percentile interpolation method. Available options: “linear” (default), “nearest”, “lower”, “higher”, “midpoint”.
characteristic (Constant, optional) – Statistics to calculate. Options: “avg” (mean), “std” (standard deviation). By default, both are computed ([“avg”, “std”]).
percentile (Constant, optional) – List of percentiles to compute, each given as a fraction between 0 and 1. Default is [0.25, 0.50, 0.75], that is, the 25th, 50th, and 75th percentiles.
precision (Constant, optional) – Convergence threshold for iterative calculations. Recommended range: 1e-9 to 1e-3. Default: 1e-3.
partitionSampling (Constant, optional) – For partitioned tables, either the number of partitions to sample (an integer) or a sampling ratio in the interval (0, 1]. Has no effect on non-partitioned tables.
- Returns:
Summary table with the minimum, maximum, count, mean, standard deviation, and requested percentiles for each numeric column.
- Return type:
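The five interpolation options differ only in how a percentile is resolved when its rank falls between two sorted values. The following is a minimal pure-Python sketch of that idea, not the library's implementation: the helper name `percentile` is hypothetical, the rank formula `(n - 1) * q` follows the convention used by common numerical libraries, and the tie-breaking rule for “nearest” is an assumption.

```python
import math


def percentile(values, q, interpolation="linear"):
    """Return the q-th quantile (0 <= q <= 1) of a non-empty sequence.

    Illustrative only: the rank convention and tie-breaking are assumptions.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= q <= 1:
        raise ValueError("q must lie in [0, 1]")

    data = sorted(values)
    pos = (len(data) - 1) * q   # fractional rank in the sorted data
    lo = math.floor(pos)        # index of the neighbour below the rank
    hi = math.ceil(pos)         # index of the neighbour above the rank
    frac = pos - lo             # distance from the lower neighbour

    if interpolation == "linear":
        return data[lo] + (data[hi] - data[lo]) * frac
    if interpolation == "lower":
        return data[lo]
    if interpolation == "higher":
        return data[hi]
    if interpolation == "midpoint":
        return (data[lo] + data[hi]) / 2
    if interpolation == "nearest":
        # Assumption: an exact tie goes to the lower neighbour.
        return data[lo] if frac <= 0.5 else data[hi]
    raise ValueError(f"unknown interpolation: {interpolation!r}")


prices = [10.0, 20.0, 30.0, 40.0]
methods = ("linear", "nearest", "lower", "higher", "midpoint")
results = {m: percentile(prices, 0.25, m) for m in methods}
print(results)
```

With four values, the 0.25 percentile has rank 0.75, which lies between the first two sorted values, so each method picks or blends 10.0 and 20.0 differently.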
- sortBy_(sortColumns, sortDirections=)#
Sorts the table in place by the specified columns and directions.
For partitioned tables, each partition is sorted independently, so the result is not globally ordered across partitions. Sorting runs in parallel when the localExecutors configuration parameter is greater than 0.
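The consequence of per-partition sorting can be sketched in plain Python. This is an illustration of the behavior described above, not the library's code: the partitions are modeled as lists of dictionaries, the helper `sort_partitions_in_place` is hypothetical, and a single `ascending` flag stands in for the per-column directions as a simplification.

```python
def sort_partitions_in_place(partitions, sort_columns, ascending=True):
    """Sort each partition's rows in place, independently of the others."""
    for rows in partitions:
        rows.sort(
            key=lambda row: tuple(row[col] for col in sort_columns),
            reverse=not ascending,
        )


# Two hypothetical partitions of the same logical table.
partitions = [
    [{"sym": "B", "qty": 5}, {"sym": "A", "qty": 9}],
    [{"sym": "A", "qty": 2}, {"sym": "C", "qty": 7}],
]

sort_partitions_in_place(partitions, ["qty"])

# Reading the partitions back to back shows that the order holds only
# inside each partition, not across the whole table.
flattened = [row["qty"] for part in partitions for row in part]
print(flattened)
```

If a globally ordered result is needed, the rows would have to be merged or re-sorted after the per-partition step.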