Data Utils

Utility functions for reading orbit data from different file formats.

Loading Data

load_orbit_data

 load_orbit_data (file_path:str, variable_name:Optional[str]=None,
                  dataset_path:Optional[str]=None)

Load orbit data from MATLAB .mat files, HDF5 .h5 files, or NumPy .npy files.

	Type	Default	Details
file_path	str		The path to the .mat, .h5, or .npy file.
variable_name	Optional	None	Name of the variable in the .mat file, optional.
dataset_path	Optional	None	Path to the dataset in the .h5 file, optional.
Returns	Any		The loaded orbit data.
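
The description above implies that the reader is chosen from the file extension. A minimal sketch of that dispatch, not the library's actual implementation, could look like the following. The name `load_array_by_extension` is hypothetical, and this sketch requires `variable_name` and `dataset_path` for .mat and .h5 files, which is an assumption (the documented function treats both as optional).

```python
import os
import tempfile

import numpy as np


def load_array_by_extension(file_path, variable_name=None, dataset_path=None):
    """Hypothetical helper: choose a reader based on the file extension."""
    ext = os.path.splitext(file_path)[1].lower()
    if ext == ".npy":
        return np.load(file_path)
    if ext == ".h5":
        import h5py  # imported lazily so .npy users do not need h5py

        if dataset_path is None:
            raise ValueError("this sketch needs dataset_path for .h5 files")
        with h5py.File(file_path, "r") as f:
            return f[dataset_path][()]  # read the whole dataset into memory
    if ext == ".mat":
        from scipy.io import loadmat  # imported lazily for the same reason

        if variable_name is None:
            raise ValueError("this sketch needs variable_name for .mat files")
        return loadmat(file_path)[variable_name]
    raise ValueError(f"Unsupported file extension: {ext!r}")


# Round trip through a temporary .npy file
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "orbits.npy")
    np.save(path, np.zeros((4, 7, 10)))
    loaded = load_array_by_extension(path)

print(loaded.shape)  # (4, 7, 10)
```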

load_memmap_array

 load_memmap_array (file_path:str, mode:str='c')

Load a .npy file as a memory-mapped array using numpy.memmap.

	Type	Default	Details
file_path	str		The path to the .npy file as a string.
mode	str	c	Mode for memory-mapping ('r', 'r+', 'w+', 'c').
Returns	memmap		Returns a memory-mapped array.
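
To illustrate what the default mode 'c' (copy-on-write) means, here is a small sketch that assumes the function behaves like `np.load(..., mmap_mode=mode)`; that equivalence is an assumption, not something stated above. Changes made to the mapped array stay in memory and the file on disk is left untouched.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "orbits.npy")
    np.save(path, np.arange(12, dtype=np.float64).reshape(3, 4))

    # mmap_mode="c" maps the file copy-on-write
    mapped = np.load(path, mmap_mode="c")
    is_memmap = isinstance(mapped, np.memmap)

    # Writing is allowed, but only affects the in-memory copy
    mapped[0, 0] = -1.0

    # Reading the file again shows the original value
    on_disk = np.load(path)
    unchanged = bool(on_disk[0, 0] == 0.0)

    del mapped  # release the mapping before the temporary directory is removed

print(is_memmap, unchanged)  # True True
```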

get_orbit_features

 get_orbit_features (file_path:str, variable_name:Optional[str]=None,
                     dataset_path:Optional[str]=None)

Load orbit feature data from a specified file and convert it to a DataFrame.

	Type	Default	Details
file_path	str		The path to the file (can be .mat, .h5, or .npy).
variable_name	Optional	None	Name of the variable in the .mat file, optional.
dataset_path	Optional	None	Path to the dataset in the .h5 file, optional.
Returns	DataFrame		DataFrame with detailed orbit features.

Save Data

save_data

 save_data (data:numpy.ndarray, file_name:str)

Save a numpy array to a file based on the file extension specified in file_name. Supports saving to HDF5 (.hdf5) or NumPy (.npy) file formats.

	Type	Details
data	ndarray	The numpy array data to save.
file_name	str	The name of the file to save the data in, including the extension.
Returns	None
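
The extension-based behaviour described above can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: `save_array_by_extension` is a hypothetical name, the HDF5 dataset name `"data"` is made up for the example, and raising `ValueError` for other extensions is assumed.

```python
import os
import tempfile

import numpy as np


def save_array_by_extension(data, file_name):
    """Hypothetical helper: choose a writer based on the file extension."""
    ext = os.path.splitext(file_name)[1].lower()
    if ext == ".npy":
        np.save(file_name, data)
    elif ext == ".hdf5":
        import h5py  # imported lazily so .npy users do not need h5py

        with h5py.File(file_name, "w") as f:
            # The dataset name "data" is an assumption made for this sketch
            f.create_dataset("data", data=data)
    else:
        raise ValueError(f"Unsupported file extension: {ext!r}")


# Save to a temporary .npy file and read it back
original = np.arange(24, dtype=np.float64).reshape(2, 3, 4)
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "orbits.npy")
    save_array_by_extension(original, target)
    restored = np.load(target)

print(np.array_equal(original, restored))  # True
```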

Get Example Data

get_example_orbit_data

 get_example_orbit_data ()

Load example orbit data from a numpy file located in the example_data directory.

data = get_example_orbit_data()
data.shape

(400, 7, 100)

Order Labels and Array Given Target

order_labels_and_array_with_target

 order_labels_and_array_with_target (labels:numpy.ndarray,
                                     array:numpy.ndarray,
                                     target_label:str,
                                     place_at_end:bool=False)

Orders labels and array by placing entries with target_label at either the start or the end.

	Type	Default	Details
labels	ndarray		Array of labels to be ordered
array	ndarray		Array to be ordered according to labels
target_label	str		Label to order by
place_at_end	bool	False	Whether to place target label at end
Returns	tuple		Returns ordered labels and array

import numpy as np

# Sample labels and a sample 3D array
labels = np.array(['apple', 'banana', 'apple', 'orange', 'banana', 'grape'])
array = np.array([[[1, 2], [3, 4]], 
                  [[5, 6], [7, 8]], 
                  [[9, 10], [11, 12]], 
                  [[13, 14], [15, 16]], 
                  [[17, 18], [19, 20]], 
                  [[21, 22], [23, 24]]])
target_label = 'apple'

ordered_labels, ordered_array = order_labels_and_array_with_target(labels, array, target_label)

print(ordered_labels)
print(ordered_array)

['apple' 'apple' 'banana' 'orange' 'banana' 'grape']
[[[ 1  2]
  [ 3  4]]

 [[ 9 10]
  [11 12]]

 [[ 5  6]
  [ 7  8]]

 [[13 14]
  [15 16]]

 [[17 18]
  [19 20]]

 [[21 22]
  [23 24]]]
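
The output above keeps the original relative order within the target group and within the remaining entries. One way to get that behaviour, sketched here with the hypothetical name `order_with_target` (not the library's implementation), is to concatenate the indices of matching and non-matching labels. The example also shows `place_at_end=True`.

```python
import numpy as np


def order_with_target(labels, array, target_label, place_at_end=False):
    """Hypothetical re-implementation for illustration only."""
    is_target = labels == target_label
    target_idx = np.flatnonzero(is_target)   # positions of the target label
    other_idx = np.flatnonzero(~is_target)   # all remaining positions
    if place_at_end:
        order = np.concatenate([other_idx, target_idx])
    else:
        order = np.concatenate([target_idx, other_idx])
    # Index both arrays with the same permutation so they stay aligned
    return labels[order], array[order]


labels = np.array(['apple', 'banana', 'apple', 'orange', 'banana', 'grape'])
array = np.arange(6)

end_labels, end_array = order_with_target(labels, array, 'apple', place_at_end=True)
print(end_labels)  # ['banana' 'orange' 'banana' 'grape' 'apple' 'apple']
print(end_array)   # [1 3 4 5 0 2]
```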

Random Sampler

sample_orbits

 sample_orbits (orbit_data:numpy.ndarray, sample_spec:Union[dict,int],
                labels:Optional[numpy.ndarray]=None)

Randomly sample orbits from the provided dataset.

	Type	Default	Details
orbit_data	ndarray		Array of orbit data with shape (num_orbits, 6, num_time_points)
sample_spec	Union		Number of samples per class (dict) or total samples (int)
labels	Optional	None	Array of labels for each orbit
Returns	tuple
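
The two forms of `sample_spec` can be sketched as below. This is only an illustration: the name `sample_orbits_sketch`, the `seed` parameter, sampling without replacement, and the `(sampled_data, sampled_labels)` return layout are all assumptions, since the table above does not describe the contents of the returned tuple.

```python
import numpy as np


def sample_orbits_sketch(orbit_data, sample_spec, labels=None, seed=None):
    """Hypothetical sketch: an int draws a total, a dict draws per class."""
    rng = np.random.default_rng(seed)
    if isinstance(sample_spec, int):
        # Draw sample_spec distinct orbits from the whole dataset
        idx = rng.choice(len(orbit_data), size=sample_spec, replace=False)
    else:
        if labels is None:
            raise ValueError("labels are required when sample_spec is a dict")
        parts = []
        for label, n in sample_spec.items():
            candidates = np.flatnonzero(labels == label)
            parts.append(rng.choice(candidates, size=n, replace=False))
        idx = np.concatenate(parts)
    sampled_labels = labels[idx] if labels is not None else None
    return orbit_data[idx], sampled_labels


orbits = np.zeros((10, 6, 5))
classes = np.array(["a"] * 6 + ["b"] * 4)

subset, subset_labels = sample_orbits_sketch(orbits, {"a": 2, "b": 3}, classes, seed=0)
print(subset.shape)  # (5, 6, 5)
```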

Random Discarder

discard_random_labels

 discard_random_labels (data:numpy.ndarray, labels:numpy.ndarray,
                        discard_labels:Union[List,Dict,int])

Discards random or specified labels from the dataset.

Returns tuple of (discarded labels, filtered data, filtered labels).

	Type	Details
data	ndarray	Dataset to filter
labels	ndarray	Labels corresponding to the data
discard_labels	Union	Labels to discard - list, dict or number
Returns	Tuple
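
A rough sketch of the list and integer cases follows, using the hypothetical name `discard_labels_sketch` and an assumed `seed` parameter. It assumes an integer means "discard that many randomly chosen distinct labels". The dict case is left out because its semantics are not described above.

```python
import numpy as np


def discard_labels_sketch(data, labels, discard_labels, seed=None):
    """Hypothetical sketch covering only the list and int cases."""
    if isinstance(discard_labels, dict):
        raise NotImplementedError("dict semantics are not covered by this sketch")
    if isinstance(discard_labels, int):
        # Assumption: an int picks that many distinct labels at random
        rng = np.random.default_rng(seed)
        unique = np.unique(labels)
        to_discard = list(rng.choice(unique, size=discard_labels, replace=False))
    else:
        to_discard = list(discard_labels)
    keep = ~np.isin(labels, to_discard)
    # Same return order as documented: discarded labels, data, labels
    return to_discard, data[keep], labels[keep]


data = np.arange(6)
labels = np.array(["a", "b", "a", "c", "b", "c"])

dropped, kept_data, kept_labels = discard_labels_sketch(data, labels, ["b"])
print(dropped)      # ['b']
print(kept_data)    # [0 2 3 5]
print(kept_labels)  # ['a' 'a' 'c' 'c']
```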

Remove Duplicates, Preserve Order

remove_duplicates_preserve_order

 remove_duplicates_preserve_order (input_list:List)

Removes duplicate items from a list while preserving the original order.

	Type	Details
input_list	List	Input list that may contain duplicates
Returns	List	Returns list with duplicates removed while preserving order
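
One common way to get this behaviour in Python is `dict.fromkeys`, which keeps the first occurrence of each key in insertion order. The sketch below uses the hypothetical name `dedupe_preserve_order` and assumes the items are hashable; it is not necessarily how the library implements the function.

```python
def dedupe_preserve_order(items):
    """Hypothetical sketch: keep the first occurrence of each item."""
    # dict keys preserve insertion order (Python 3.7+)
    return list(dict.fromkeys(items))


deduped = dedupe_preserve_order([3, 1, 3, 2, 1])
print(deduped)  # [3, 1, 2]
```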

Dataloaders

create_dataloaders

 create_dataloaders (scaled_data:torch.Tensor, val_split:float=0.2,
                     batch_size:int=32)

Creates train and validation dataloaders from input tensor data.

	Type	Default	Details
scaled_data	Tensor		Input tensor of scaled data
val_split	float	0.2	Fraction of data to use for validation
batch_size	int	32	Batch size for dataloaders
Returns	Tuple		Returns train and optional val dataloaders
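
A train/validation split like the one described can be sketched with standard `torch.utils.data` utilities. The name `make_dataloaders_sketch` is hypothetical, and the details (a random split, shuffling only the training loader, returning `None` when the validation set would be empty) are assumptions rather than the library's documented behaviour.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split


def make_dataloaders_sketch(scaled_data, val_split=0.2, batch_size=32):
    """Hypothetical sketch of a train/validation dataloader factory."""
    dataset = TensorDataset(scaled_data)
    n_val = int(round(len(dataset) * val_split))
    if n_val == 0:
        # No validation data requested: return only a training loader
        return DataLoader(dataset, batch_size=batch_size, shuffle=True), None
    n_train = len(dataset) - n_val
    train_set, val_set = random_split(dataset, [n_train, n_val])
    train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    val_loader = DataLoader(val_set, batch_size=batch_size, shuffle=False)
    return train_loader, val_loader


data = torch.zeros(100, 7, 50)
train_loader, val_loader = make_dataloaders_sketch(data, val_split=0.2, batch_size=32)
print(len(train_loader.dataset), len(val_loader.dataset))  # 80 20
```

Each batch from a `TensorDataset` is a sequence with one tensor per wrapped input, so the data tensor in this sketch is `batch[0]`.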

Scaler

TSFeatureWiseScaler

 TSFeatureWiseScaler (feature_range:tuple=(0, 1))

Scales time series data feature-wise using PyTorch tensors.
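
The `feature_range` argument suggests min-max scaling applied separately to each feature. The sketch below shows that idea for a tensor of shape (num_orbits, num_features, num_time_points). Both the min-max formula and the axis layout are assumptions, and `minmax_scale_per_feature` is a hypothetical function rather than the class's actual method.

```python
import torch


def minmax_scale_per_feature(x, feature_range=(0.0, 1.0)):
    """Hypothetical sketch: min-max scale each feature (dim 1) independently."""
    lo, hi = feature_range
    # Reduce over orbits (dim 0) and time (dim 2), keeping one value per feature
    x_min = x.amin(dim=(0, 2), keepdim=True)
    x_max = x.amax(dim=(0, 2), keepdim=True)
    # Guard against division by zero for constant features
    span = (x_max - x_min).clamp_min(1e-12)
    return lo + (x - x_min) / span * (hi - lo)


x = torch.rand(8, 7, 20) * 100.0
scaled = minmax_scale_per_feature(x)

# After scaling, every feature spans the requested range
print(scaled.amin(dim=(0, 2)))
print(scaled.amax(dim=(0, 2)))
```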

TSGlobalScaler

 TSGlobalScaler ()

Scales time series data globally using PyTorch tensors.