Data Utils

Utility functions for reading orbit data from different file formats

Loading Data


load_orbit_data

 load_orbit_data (file_path:str, variable_name:Optional[str]=None,
                  dataset_path:Optional[str]=None)

Load orbit data from MATLAB .mat files, HDF5 .h5 files, or NumPy .npy files.

Name           Type           Default  Details
file_path      str                     The path to the .mat, .h5, or .npy file.
variable_name  Optional[str]  None     Name of the variable in the .mat file, optional.
dataset_path   Optional[str]  None     Path to the dataset in the .h5 file, optional.
Returns        Any                     The loaded orbit data.
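The extension-based dispatch described above can be sketched roughly as follows. This is an illustration only, not the library's implementation: the name load_orbit_data_sketch is hypothetical, and the behaviour when variable_name or dataset_path is omitted is an assumption. SciPy and h5py are imported lazily, so only NumPy is needed for .npy files.

```python
import os
from typing import Any, Optional

import numpy as np


def load_orbit_data_sketch(file_path: str,
                           variable_name: Optional[str] = None,
                           dataset_path: Optional[str] = None) -> Any:
    """Hypothetical loader that picks a reader from the file extension."""
    ext = os.path.splitext(file_path)[1].lower()
    if ext == ".npy":
        return np.load(file_path)
    if ext == ".mat":
        from scipy.io import loadmat  # lazy import, requires SciPy
        contents = loadmat(file_path)
        # Assumption: without a variable name, return the whole dict of variables.
        return contents if variable_name is None else contents[variable_name]
    if ext == ".h5":
        import h5py  # lazy import, requires h5py
        if dataset_path is None:
            # Assumption: this sketch insists on an explicit dataset path.
            raise ValueError("dataset_path is required for .h5 files in this sketch")
        with h5py.File(file_path, "r") as f:
            return f[dataset_path][()]  # read the full dataset into memory
    raise ValueError(f"Unsupported file extension: {ext!r}")
```

Calling load_orbit_data_sketch("orbits.npy") would return the stored array directly, while a .mat or .h5 file additionally needs the variable name or dataset path to select one array from the container.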

load_memmap_array

 load_memmap_array (file_path:str, mode:str='c')

Load a .npy file as a memory-mapped array using numpy.memmap.

Name       Type    Default  Details
file_path  str              The path to the .npy file as a string.
mode       str     c        Mode for memory-mapping ('r', 'r+', 'w+', 'c').
Returns    memmap           Returns a memory-mapped array.
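As a rough illustration of what the mode argument means, the sketch below memory-maps a .npy file through np.load with mmap_mode, which returns a numpy.memmap. The helper name load_memmap_sketch is hypothetical, and the library may construct the memmap differently.

```python
import os
import tempfile

import numpy as np


def load_memmap_sketch(file_path: str, mode: str = "c") -> np.memmap:
    """Hypothetical helper: map a .npy file instead of reading it fully."""
    # np.load parses the .npy header (shape, dtype) and maps the data lazily.
    return np.load(file_path, mmap_mode=mode)


# Write a small array to a temporary .npy file, then map it copy-on-write.
path = os.path.join(tempfile.mkdtemp(), "orbits.npy")
np.save(path, np.zeros((3, 4)))

mapped = load_memmap_sketch(path, mode="c")
mapped[0, 0] = 5.0  # with mode 'c' the change stays in memory only
unchanged_on_disk = np.load(path)[0, 0] == 0.0
```

With mode 'r' the array is read-only, 'r+' writes changes back to the file, and 'c' (copy-on-write) allows in-memory edits without touching the file.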

get_orbit_features

 get_orbit_features (file_path:str, variable_name:Optional[str]=None,
                     dataset_path:Optional[str]=None)

Load orbit feature data from a specified file and convert it to a DataFrame.

Name           Type           Default  Details
file_path      str                     The path to the file (can be .mat, .h5, or .npy).
variable_name  Optional[str]  None     Name of the variable in the .mat file, optional.
dataset_path   Optional[str]  None     Path to the dataset in the .h5 file, optional.
Returns        DataFrame               DataFrame with detailed orbit features.

Save Data


save_data

 save_data (data:numpy.ndarray, file_name:str)

Save a numpy array to a file based on the file extension specified in file_name. Supports saving to HDF5 (.hdf5) or NumPy (.npy) file formats.

Name       Type     Details
data       ndarray  The numpy array data to save.
file_name  str      The name of the file to save the data in, including the extension.
Returns    None
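A minimal sketch of saving by extension might look like the following. The function name save_data_sketch, the HDF5 dataset name "data", and the error raised for other extensions are all assumptions made for illustration, not details taken from the library.

```python
import os

import numpy as np


def save_data_sketch(data: np.ndarray, file_name: str) -> None:
    """Hypothetical saver that chooses the format from the file extension."""
    ext = os.path.splitext(file_name)[1].lower()
    if ext == ".npy":
        np.save(file_name, data)
    elif ext == ".hdf5":
        import h5py  # lazy import, requires h5py
        with h5py.File(file_name, "w") as f:
            # Assumption: the array is stored under a dataset called "data".
            f.create_dataset("data", data=data)
    else:
        raise ValueError(f"Unsupported file extension: {ext!r}")
```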

Get Example Data


get_example_orbit_data

 get_example_orbit_data ()

Load example orbit data from a numpy file located in the example_data directory.

data = get_example_orbit_data()
data.shape

(400, 7, 100)

Order Labels and Array Given Target


order_labels_and_array_with_target

 order_labels_and_array_with_target (labels:numpy.ndarray,
                                     array:numpy.ndarray,
                                     target_label:str,
                                     place_at_end:bool=False)

Orders labels and array by placing entries with target_label either at start or end.

Name          Type     Default  Details
labels        ndarray           Array of labels to be ordered
array         ndarray           Array to be ordered according to labels
target_label  str               Label to order by
place_at_end  bool     False    Whether to place target label at end
Returns       tuple             The ordered labels and array

import numpy as np

# Sample labels and a sample 3D array
labels = np.array(['apple', 'banana', 'apple', 'orange', 'banana', 'grape'])
array = np.array([[[1, 2], [3, 4]], 
                  [[5, 6], [7, 8]], 
                  [[9, 10], [11, 12]], 
                  [[13, 14], [15, 16]], 
                  [[17, 18], [19, 20]], 
                  [[21, 22], [23, 24]]])
target_label = 'apple'

ordered_labels, ordered_array = order_labels_and_array_with_target(labels, array, target_label)

print(ordered_labels)
print(ordered_array)

['apple' 'apple' 'banana' 'orange' 'banana' 'grape']
[[[ 1  2]
  [ 3  4]]

 [[ 9 10]
  [11 12]]

 [[ 5  6]
  [ 7  8]]

 [[13 14]
  [15 16]]

 [[17 18]
  [19 20]]

 [[21 22]
  [23 24]]]

Random Sampler


sample_orbits

 sample_orbits (orbit_data:numpy.ndarray, sample_spec:Union[dict,int],
                labels:Optional[numpy.ndarray]=None)

Randomly sample orbits from the provided dataset.

Name         Type               Default  Details
orbit_data   ndarray                     Array of orbit data with shape (num_orbits, 6, num_time_points)
sample_spec  Union[dict, int]            Number of samples per class (dict) or total samples (int)
labels       Optional[ndarray]  None     Array of labels for each orbit
Returns      tuple
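To make the two forms of sample_spec concrete, here is a minimal NumPy sketch. It is not the library's code: the name sample_orbits_sketch, the seed parameter, sampling without replacement, and the (data, labels) return layout are all assumptions.

```python
from typing import Dict, Optional, Tuple, Union

import numpy as np


def sample_orbits_sketch(orbit_data: np.ndarray,
                         sample_spec: Union[Dict[str, int], int],
                         labels: Optional[np.ndarray] = None,
                         seed: Optional[int] = None
                         ) -> Tuple[np.ndarray, Optional[np.ndarray]]:
    """Hypothetical sampler: an int draws a total, a dict draws per class."""
    rng = np.random.default_rng(seed)  # seed is an illustrative extra parameter
    if isinstance(sample_spec, int):
        # Draw sample_spec distinct orbits from the whole dataset.
        idx = rng.choice(len(orbit_data), size=sample_spec, replace=False)
    else:
        if labels is None:
            raise ValueError("labels are required when sample_spec is a dict")
        parts = []
        for label, count in sample_spec.items():
            candidates = np.flatnonzero(labels == label)  # orbits of this class
            parts.append(rng.choice(candidates, size=count, replace=False))
        idx = np.concatenate(parts)
    sampled_labels = None if labels is None else labels[idx]
    return orbit_data[idx], sampled_labels


orbits = np.random.default_rng(0).random((10, 6, 5))
classes = np.array(["halo"] * 6 + ["lyapunov"] * 4)  # made-up class names
subset, subset_labels = sample_orbits_sketch(
    orbits, {"halo": 2, "lyapunov": 3}, classes, seed=1)
```

Here subset has shape (5, 6, 5): two orbits from the first class followed by three from the second.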

Random Discarder


discard_random_labels

 discard_random_labels (data:numpy.ndarray, labels:numpy.ndarray,
                        discard_labels:Union[List,Dict,int])

Discards random or specified labels from the dataset.

Returns a tuple of (discarded labels, filtered data, filtered labels).

Name            Type                    Details
data            ndarray                 Dataset to filter
labels          ndarray                 Labels corresponding to the data
discard_labels  Union[List, Dict, int]  Labels to discard: a list, dict, or number
Returns         Tuple                   (discarded labels, filtered data, filtered labels)
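The sketch below illustrates one plausible reading of the list and integer cases: a list names the labels to drop, and an integer drops that many randomly chosen distinct labels. The dict case is not described on this page, so the sketch leaves it out. The name discard_labels_sketch and the seed parameter are hypothetical.

```python
from typing import List, Optional, Tuple, Union

import numpy as np


def discard_labels_sketch(data: np.ndarray,
                          labels: np.ndarray,
                          discard_labels: Union[List[str], int],
                          seed: Optional[int] = None
                          ) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    """Hypothetical filter: remove every sample whose label is discarded."""
    if isinstance(discard_labels, dict):
        # The dict semantics are not documented here, so they are not guessed.
        raise NotImplementedError("dict specs are outside this sketch")
    if isinstance(discard_labels, int):
        rng = np.random.default_rng(seed)  # seed is an illustrative extra
        # Assumption: an int means "discard this many distinct labels at random".
        to_discard = rng.choice(np.unique(labels), size=discard_labels,
                                replace=False)
    else:
        to_discard = np.asarray(discard_labels)
    keep = ~np.isin(labels, to_discard)  # True for samples that survive
    return to_discard, data[keep], labels[keep]


values = np.arange(5)
names = np.array(["a", "b", "c", "a", "b"])
dropped, kept_values, kept_names = discard_labels_sketch(values, names, ["a"])
```

In this usage the two samples labelled "a" are removed, leaving the values 1, 2 and 4 with labels "b", "c" and "b".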

Remove Duplicates Preserve Order


remove_duplicates_preserve_order

 remove_duplicates_preserve_order (input_list:List)

Removes duplicate items from a list while preserving the original order.

Name        Type  Details
input_list  List  Input list that may contain duplicates
Returns     List  Returns list with duplicates removed while preserving order
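One common way to get this behaviour in plain Python relies on dictionaries keeping insertion order (Python 3.7+). The sketch below shows that idiom under the hypothetical name remove_duplicates_sketch and assumes the items are hashable. The library function may be implemented differently.

```python
from typing import Hashable, List


def remove_duplicates_sketch(input_list: List[Hashable]) -> List[Hashable]:
    """Hypothetical helper: keep the first occurrence of each item."""
    # dict keys are unique and remember the order in which they were inserted.
    return list(dict.fromkeys(input_list))


print(remove_duplicates_sketch([3, 1, 3, 2, 1]))  # prints [3, 1, 2]
```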

Dataloaders


create_dataloaders

 create_dataloaders (scaled_data:torch.Tensor, val_split:float=0.2,
                     batch_size:int=32)

Creates train and validation dataloaders from input tensor data.

Name         Type    Default  Details
scaled_data  Tensor           Input tensor of scaled data
val_split    float   0.2      Fraction of data to use for validation
batch_size   int     32       Batch size for dataloaders
Returns      Tuple            Returns train and optional val dataloaders
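A train/validation split of this kind can be sketched with standard PyTorch utilities. The code below is an assumption about how such a helper might work, not the library's implementation: the name create_dataloaders_sketch is hypothetical, and returning None for the validation loader when val_split is 0 is a guess at what "optional" means.

```python
from typing import Optional, Tuple

import torch
from torch.utils.data import DataLoader, TensorDataset, random_split


def create_dataloaders_sketch(scaled_data: torch.Tensor,
                              val_split: float = 0.2,
                              batch_size: int = 32
                              ) -> Tuple[DataLoader, Optional[DataLoader]]:
    """Hypothetical helper: random train/validation split into DataLoaders."""
    dataset = TensorDataset(scaled_data)
    if val_split <= 0:
        # Assumption: no validation loader is built when nothing is held out.
        return DataLoader(dataset, batch_size=batch_size, shuffle=True), None
    n_val = int(len(dataset) * val_split)
    n_train = len(dataset) - n_val
    train_set, val_set = random_split(dataset, [n_train, n_val])
    train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    val_loader = DataLoader(val_set, batch_size=batch_size, shuffle=False)
    return train_loader, val_loader


train_loader, val_loader = create_dataloaders_sketch(torch.rand(100, 7, 50))
```

With 100 orbits and the default val_split of 0.2, this gives 80 training samples and 20 validation samples. Each batch is a one-element tuple because TensorDataset wraps a single tensor.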

Scaler

TSFeatureWiseScaler

 TSFeatureWiseScaler (feature_range:tuple=(0, 1))

Scales time series data feature-wise using PyTorch tensors.
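As a rough picture of what feature-wise scaling to a feature_range could mean, the sketch below applies min-max scaling per feature. It assumes a tensor layout of (num_orbits, num_features, num_time_points) and min-max scaling. Neither is confirmed by this page, and scale_feature_wise_sketch is a hypothetical function, not a method of TSFeatureWiseScaler.

```python
from typing import Tuple

import torch


def scale_feature_wise_sketch(x: torch.Tensor,
                              feature_range: Tuple[float, float] = (0.0, 1.0)
                              ) -> torch.Tensor:
    """Hypothetical min-max scaling applied independently to each feature."""
    low, high = feature_range
    # Per-feature extremes, taken over orbits (dim 0) and time steps (dim 2).
    x_min = torch.amin(x, dim=(0, 2), keepdim=True)
    x_max = torch.amax(x, dim=(0, 2), keepdim=True)
    span = (x_max - x_min).clamp_min(1e-12)  # guard against constant features
    return (x - x_min) / span * (high - low) + low


raw = torch.rand(10, 7, 100) * 50 - 10
scaled = scale_feature_wise_sketch(raw)
```

After scaling, each of the 7 features spans the full range from 0 to 1 on its own, regardless of the magnitude of the others.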


TSGlobalScaler

 TSGlobalScaler ()

Scales time series data globally using PyTorch tensors.