Table

class swordfish._swordfishcpp.Table

Represents a tabular data structure.

In a table, data is logically organized in rows and columns. Each row represents a unique record, and each column represents a field of that record. The class provides methods for data manipulation and analysis.

classmethod from_pandas(data, *, types=None)

Creates a Table instance from a Pandas DataFrame.

Parameters:
  • data (pd.DataFrame) – The Pandas DataFrame to convert.

  • types (Dict[str, DataType], optional) – Column type mappings where keys are column names and values are DataType enumerations. If None, types are inferred automatically.

Returns:

A new Table instance containing the DataFrame data.

Return type:

Table

to_pandas()

Converts this Table to a Pandas DataFrame.

Returns:

A DataFrame with equivalent data and column types automatically mapped to compatible Pandas dtypes.

Return type:

pd.DataFrame
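
A minimal sketch of a round trip through the two conversion methods above. The helper name `round_trip` is hypothetical, and the sketch assumes that `table_cls` is the `Table` class and that pandas and swordfish are already set up in the caller's environment. It relies only on the `from_pandas` and `to_pandas` signatures listed in this section.

```python
def round_trip(table_cls, frame, types=None):
    """Convert a DataFrame to a Table and back again.

    table_cls is expected to expose the documented classmethod
    from_pandas(data, *, types=None), and the object it returns is
    expected to expose to_pandas().
    """
    # `types` is keyword-only in the documented signature.
    table = table_cls.from_pandas(frame, types=types)
    return table.to_pandas()
```

With `types=None`, type inference is left to the library. Because types are mapped automatically in both directions, the dtypes of the returned DataFrame are not guaranteed to match those of the input.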

property types: Dict[str, DataType]

Returns the data types of all table columns.

Returns:

Mapping of column names to their corresponding DataType values.

Return type:

Dict[str, DataType]

property name: str

Returns the table’s name.

Returns:

The assigned name of this table.

Return type:

str

property is_shared: bool

Indicates whether this table is shared across sessions.

Returns:

True if the table is shared, False if it is private to the current session.

Return type:

bool

share(name, readonly=False)

Makes this table accessible across sessions with the specified name.

Parameters:
  • name (str) – Global name for the shared table.

  • readonly (bool, optional) – Whether to restrict the table to read-only access. Defaults to False.

Returns:

This table instance for method chaining.

Return type:

Self
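
Because `share` returns the table itself, it can be combined with the `is_shared` property in a small guard. The helper below is a hypothetical sketch (the name `share_once` is not part of the library) that uses only the `is_shared` property and the `share(name, readonly=False)` signature documented here. How the library behaves when an already shared table is shared again is not stated in this section, so the sketch avoids that call.

```python
def share_once(table, name, readonly=False):
    """Share `table` under `name` unless it is already shared."""
    if table.is_shared:
        # Already visible to other sessions, so leave it untouched.
        return table
    # share() is documented to return the table itself, so the result
    # can be returned (or chained) directly.
    return table.share(name, readonly=readonly)
```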

schema()

Returns the table’s schema information.

Returns:

Column names mapped to their respective data types.

Return type:

Dictionary

head(n=)

Returns the first n rows of the table.

Parameters:

n (Constant, optional) – Number of rows to return. If omitted, the default value is used.

Returns:

A table containing the first n rows.

Return type:

Constant

tail(n=)

Retrieves the last n rows of the table.

Parameters:

n (Constant, optional) – The number of rows to retrieve. Defaults to DFLT.

Returns:

A subset of the table containing the last n rows.

Return type:

Constant
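
The row selection performed by `head` and `tail` can be pictured with a plain-Python sketch over a column-oriented dictionary. This illustrates the concept only and is not swordfish code: the functions, the `trades` data, and the handling of `n` larger than the row count are assumptions, and `n` is assumed to be non-negative.

```python
def head(columns, n):
    """Return the first n rows of a column-oriented table (dict of lists)."""
    return {name: values[:n] for name, values in columns.items()}


def tail(columns, n):
    """Return the last n rows of a column-oriented table (dict of lists)."""
    # values[-n:] would return every row when n == 0, so slice from an
    # explicit start index instead.
    return {
        name: values[max(len(values) - n, 0):]
        for name, values in columns.items()
    }


trades = {"sym": ["A", "B", "C", "D"], "qty": [10, 20, 30, 40]}
first_two = head(trades, 2)   # rows for A and B
last_two = tail(trades, 2)    # rows for C and D
```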

count()

Counts the number of rows in the table.

Returns:

The number of rows in the table.

Return type:

Constant

summary(interpolation=, characteristic=, percentile=, precision=, partitionSampling=)

Computes comprehensive summary statistics for numeric columns.

Parameters:
  • interpolation (Constant, optional) – Percentile interpolation method. Available options: “linear” (default), “nearest”, “lower”, “higher”, “midpoint”.

  • characteristic (Constant, optional) – Statistics to calculate. Options: “avg” (mean), “std” (standard deviation). Default computes both [“avg”, “std”].

  • percentile (Constant, optional) – List of percentile values (0-1) to compute. Default is [0.25, 0.50, 0.75] for 25th, 50th, and 75th percentiles.

  • precision (Constant, optional) – Convergence threshold for iterative calculations. Recommended range: 1e-9 to 1e-3. Default: 1e-3.

  • partitionSampling (Constant, optional) – For partitioned tables, either the number of partitions to sample (an integer) or a sampling ratio in the interval (0, 1]. Has no effect on non-partitioned tables.

Returns:

A summary table with the minimum, maximum, count, mean, standard deviation, and requested percentiles for each numeric column.

Return type:

Constant
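
The five `interpolation` options differ only in how a percentile is resolved when it falls between two data points. The plain-Python sketch below follows the common convention for these names (the same names are used by NumPy's percentile methods). Whether swordfish resolves each case in exactly this way, including ties for "nearest", is an assumption, and the function and the `prices` data are illustrative.

```python
import math


def percentile(values, q, interpolation="linear"):
    """Return the q-quantile (0 <= q <= 1) of a non-empty sequence."""
    data = sorted(values)
    pos = q * (len(data) - 1)       # fractional index of the quantile
    lo, hi = math.floor(pos), math.ceil(pos)
    if interpolation == "lower":
        return data[lo]
    if interpolation == "higher":
        return data[hi]
    if interpolation == "midpoint":
        return (data[lo] + data[hi]) / 2
    if interpolation == "nearest":
        # Exact ties go to the lower neighbour here; libraries differ.
        return data[lo] if pos - lo <= 0.5 else data[hi]
    if interpolation == "linear":
        return data[lo] + (data[hi] - data[lo]) * (pos - lo)
    raise ValueError(f"unknown interpolation: {interpolation!r}")


prices = [10, 20, 30, 40]
methods = ("linear", "nearest", "lower", "higher", "midpoint")
# The 0.75 quantile sits at fractional index 2.25, between 30 and 40.
results = {m: percentile(prices, 0.75, m) for m in methods}
```

For this data, "lower" and "higher" pick the neighbouring values 30 and 40, "midpoint" averages them, and "linear" moves a quarter of the way from 30 towards 40.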

sortBy_(sortColumns, sortDirections=)

Sorts the table in-place by specified columns and directions.

For partitioned tables, sorting occurs within each partition independently. Parallel processing is used when the localExecutors configuration parameter is greater than 0.

Parameters:
  • sortColumns (Constant) – Column name(s) to sort by. Accepts string, list of strings, or meta code expression.

  • sortDirections (Constant, optional) – Sort order for each column. True/1 for ascending (default), False/0 for descending.

Returns:

The sorted table.

Return type:

Constant
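
The pairing of `sortColumns` with `sortDirections` can be illustrated with a plain-Python sketch of a multi-column sort in which each column has its own direction. This is not swordfish code: `sort_rows` and the sample rows are hypothetical, and unlike `sortBy_` the sketch returns a new list instead of sorting in place.

```python
def sort_rows(rows, sort_columns, sort_directions=None):
    """Return rows (a list of dicts) sorted by several columns.

    sort_directions holds one flag per column: True for ascending,
    False for descending. When omitted, every column is ascending.
    """
    if isinstance(sort_columns, str):
        sort_columns = [sort_columns]
    if sort_directions is None:
        sort_directions = [True] * len(sort_columns)
    ordered = list(rows)
    # Python's sort is stable, so sorting by the least significant
    # column first and the most significant column last yields the
    # combined ordering.
    for column, ascending in reversed(list(zip(sort_columns, sort_directions))):
        ordered.sort(key=lambda row: row[column], reverse=not ascending)
    return ordered


rows = [
    {"sym": "B", "qty": 10},
    {"sym": "A", "qty": 30},
    {"sym": "B", "qty": 20},
    {"sym": "A", "qty": 5},
]
# sym ascending, then qty descending within each sym.
result = sort_rows(rows, ["sym", "qty"], [True, False])
```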