sparta.nn

class sparta.nn.OperatorBase(raw_module: Module)
Base class of sparse operators.
Examples
```python
import torch
import sparta

# Create a dense softmax layer (an instance, not the class; dim=-1 is assumed here)
dense_softmax = torch.nn.Softmax(dim=-1)
# Create a mask
mask = torch.rand((2048, 1024)) > 0.99
# Create a sparse softmax layer using the dense layer and the mask
sparse_softmax = sparta.nn.SparseSoftmax(dense_softmax, mask=mask)
# Tune the sparse softmax layer
sparta.tune(sparse_softmax, sample_inputs=[torch.rand((2048, 1024))])
```
- Parameters
raw_module (torch.nn.Module) – The corresponding dense operator.

build(params: Dict, sample_inputs: List, jit: bool = True)
Build the sparse kernel using the specified implementation and configs.
- Parameters
params (Dict) – Building parameters; must be a valid sample of the search space. params['_name'] should be a valid kernel name in self._possible_implementations, and the other key-value pairs in params are the parameters for self._possible_implementations[params['_name']].
sample_inputs (List) – sample inputs for shape inference
jit (bool) – Determine whether to build the kernel using JIT mode.
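The layout of `params` described above can be sketched in plain Python. This is only an illustration: `split_build_params` is a hypothetical helper, not part of SparTA, and the `possible_implementations` dict stands in for `self._possible_implementations`.

```python
from typing import Any, Dict, Tuple


def split_build_params(
    params: Dict[str, Any],
    possible_implementations: Dict[str, Any],
) -> Tuple[str, Dict[str, Any]]:
    """Split a sampled `params` dict into (kernel name, kernel parameters)."""
    if '_name' not in params:
        raise KeyError("params must contain a '_name' entry")
    name = params['_name']
    if name not in possible_implementations:
        raise ValueError(f'unknown kernel implementation: {name!r}')
    # Everything except '_name' belongs to the selected implementation.
    kernel_params = {k: v for k, v in params.items() if k != '_name'}
    return name, kernel_params


# Hypothetical sample drawn from a search space with two implementations.
sample = {'_name': 'sparta', 'BLOCK_SIZE_M_VALUE': 32, 'BLOCK_SIZE_K_VALUE': 64}
name, kernel_params = split_build_params(sample, {'openai': None, 'sparta': None})
print(name, kernel_params)
```

Checking `_name` up front gives a clear error for a malformed sample before any kernel compilation is attempted.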

get_search_space() -> TunableItemCfg
Get the search space of the sparse operator.
- Returns
the search space of the sparse operator.
- Return type
TunableItemCfg

set_search_space(search_space: Optional[TunableItemCfg] = None)
Input a custom search space to override the default one before tuning.
Examples
```python
import torch
import sparta
# TunableItemCfg is SparTA's search-space config class; import it from your SparTA installation.

# Create a dense linear layer
dense_linear = torch.nn.Linear(1024, 2048)
# Create a mask
weight_mask = torch.rand((2048, 1024)) > 0.99
# Create a sparse linear layer using the dense layer and the mask
sparse_linear = sparta.nn.SparseLinear(dense_linear, weight_mask=weight_mask)
# Set custom search space
search_space_cfg = TunableItemCfg('choice', {
    'openai': {},
    'sparta': {
        'BLOCK_SIZE_M_VALUE': TunableItemCfg('choice', [32, 64]),
        'BLOCK_SIZE_K_VALUE': TunableItemCfg('choice', [32, 64]),
        'BLOCK_SIZE_N_VALUE': TunableItemCfg('choice', [32, 64]),
        'THREAD_SIZE_M_VALUE': TunableItemCfg('choice', [4]),
        'THREAD_SIZE_K_VALUE': TunableItemCfg('choice', [4]),
        'THREAD_SIZE_N_VALUE': TunableItemCfg('choice', [4]),
    },
})
sparse_linear.set_search_space(search_space_cfg)
# Tune the sparse linear layer
sparta.tune(sparse_linear, sample_inputs=[torch.rand((512, 1024))])
```
- Parameters
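search_space (Optional[TunableItemCfg]) – The custom search space. Each key is the name of a kernel implementation (for example 'openai' or 'sparta'); each value is a dictionary whose keys are tunable parameters and whose values list the candidate values.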
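To get a feel for how large such a search space is, the sketch below enumerates every candidate configuration with a plain grid. It assumes nested dicts and lists as a stand-in for `TunableItemCfg('choice', ...)`; `enumerate_configs` is a hypothetical helper, not a SparTA function.

```python
from itertools import product
from typing import Any, Dict, Iterator, List


def enumerate_configs(space: Dict[str, Dict[str, List[Any]]]) -> Iterator[Dict[str, Any]]:
    """Yield one flat config dict per point of the grid."""
    for name, tunables in space.items():
        keys = list(tunables)
        # product() over zero iterables yields a single empty tuple, so an
        # implementation without tunables contributes exactly one config.
        for values in product(*(tunables[k] for k in keys)):
            yield {'_name': name, **dict(zip(keys, values))}


# Plain-dict stand-in for a search space with two implementations.
space = {
    'openai': {},
    'sparta': {
        'BLOCK_SIZE_M_VALUE': [32, 64],
        'BLOCK_SIZE_K_VALUE': [32, 64],
        'BLOCK_SIZE_N_VALUE': [32, 64],
        'THREAD_SIZE_M_VALUE': [4],
        'THREAD_SIZE_K_VALUE': [4],
        'THREAD_SIZE_N_VALUE': [4],
    },
}
configs = list(enumerate_configs(space))
print(len(configs))  # 1 config for 'openai' + 2 * 2 * 2 for 'sparta' = 9
```

Each yielded dict has the `_name` plus parameters layout that `build` and `tester` expect, so narrowing the candidate lists directly reduces the number of kernels that have to be built and timed.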

tester(params: Dict, sample_inputs: List, jit: bool = False, weight_bk: float = 0.0) -> float
Tester function for tuning. It builds the sparse kernel, runs the forward function (and optionally the backward function), and returns the measured time.
- Parameters
params (Dict) – Building parameters; must be a valid sample of the search space.
sample_inputs (List) – sample inputs for shape inference
jit (bool) – Determine whether to test the kernel using JIT mode.
weight_bk (float) – The weight of the backward time in the total time. If set to 0, the backward time is not counted.
- Returns
The performance (running latency) of the kernel.
- Return type
float
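The role of `weight_bk` can be illustrated with a small sketch. The exact formula SparTA uses is not stated above; this assumes a simple weighted sum, and `combined_latency` is a hypothetical helper.

```python
def combined_latency(forward_ms: float, backward_ms: float, weight_bk: float = 0.0) -> float:
    """Combine forward and backward timings into a single score (assumed weighted sum)."""
    if weight_bk < 0:
        raise ValueError('weight_bk must be non-negative')
    # With weight_bk == 0 the backward time drops out entirely.
    return forward_ms + weight_bk * backward_ms


forward_only = combined_latency(1.5, 4.0)               # backward ignored
with_backward = combined_latency(1.5, 4.0, weight_bk=0.5)
print(forward_only, with_backward)
```

Under this assumption, a larger `weight_bk` steers tuning toward kernels that are fast in training, while the default of 0 optimizes for inference only.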

class sparta.nn.SparseLinear(raw_module: Linear, input_mask: Optional[Tensor] = None, weight_mask: Optional[Tensor] = None, output_mask: Optional[Tensor] = None)
Sparse linear operator.
Examples
```python
import torch
import sparta

# Create a dense linear layer
dense_linear = torch.nn.Linear(1024, 2048)
# Create a mask
weight_mask = torch.rand((2048, 1024)) > 0.99
# Create a sparse linear layer using the dense layer and the mask
sparse_linear = sparta.nn.SparseLinear(dense_linear, weight_mask=weight_mask)
# Tune the sparse linear layer
sparta.tune(sparse_linear, sample_inputs=[torch.rand((512, 1024))])
```
- Parameters
raw_module (torch.nn.Linear) – The corresponding dense linear operator.
input_mask (torch.Tensor) – The input mask tensor with shape (*, in_features). The kernel mode will be “sparse x dense => dense” if the input mask is set.
weight_mask (torch.Tensor) – The weight mask tensor with shape (out_features, in_features). The kernel mode will be “dense x sparse => dense” if the weight mask is set.
output_mask (torch.Tensor) – The output mask tensor with shape (*, out_features). The kernel mode will be “dense x dense => sparse” if the output mask is set.
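The relationship between the three masks and the kernel mode can be summarized in a small lookup. This is a sketch, not SparTA code: `kernel_mode` is a hypothetical helper, and the requirement that exactly one mask is given is an assumption that the text above does not confirm.

```python
from typing import Any, Optional


def kernel_mode(
    input_mask: Optional[Any] = None,
    weight_mask: Optional[Any] = None,
    output_mask: Optional[Any] = None,
) -> str:
    """Return the kernel mode implied by whichever mask is set."""
    modes = {
        'input': 'sparse x dense => dense',
        'weight': 'dense x sparse => dense',
        'output': 'dense x dense => sparse',
    }
    masks = {'input': input_mask, 'weight': weight_mask, 'output': output_mask}
    given = [key for key, mask in masks.items() if mask is not None]
    # Assumption: exactly one of the three masks is provided.
    if len(given) != 1:
        raise ValueError(f'expected exactly one mask, got {len(given)}')
    return modes[given[0]]


# Any non-None object works as a placeholder mask in this sketch.
mode = kernel_mode(weight_mask=[[True, False], [False, True]])
print(mode)
```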

class sparta.nn.SparseSoftmax(raw_module: Softmax, mask: Optional[Tensor] = None)
Sparse softmax operator.
Examples
```python
import torch
import sparta

# Create a dense softmax layer (an instance, not the class; dim=-1 is assumed here)
dense_softmax = torch.nn.Softmax(dim=-1)
# Create a mask
mask = torch.rand((2048, 1024)) > 0.99
# Create a sparse softmax layer using the dense layer and the mask
sparse_softmax = sparta.nn.SparseSoftmax(dense_softmax, mask=mask)
# Tune the sparse softmax layer
sparta.tune(sparse_softmax, sample_inputs=[torch.rand((2048, 1024))])
```
- Parameters
raw_module (torch.nn.Softmax) – The corresponding dense softmax operator.
mask (torch.Tensor) – The mask with the same shape as the input tensor.
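As a reference for what a masked softmax computes, here is a minimal pure-Python sketch for a single row. It assumes that only positions where the mask is true take part in the normalization and that masked-out positions produce 0; the text above does not spell out SparTA's exact semantics, and `masked_softmax_row` is a hypothetical name.

```python
import math
from typing import List


def masked_softmax_row(row: List[float], mask: List[bool]) -> List[float]:
    """Softmax over the kept entries of `row`; masked-out entries become 0."""
    kept = [x for x, keep in zip(row, mask) if keep]
    if not kept:
        # Nothing is kept in this row, so every output is 0.
        return [0.0] * len(row)
    peak = max(kept)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) if keep else 0.0 for x, keep in zip(row, mask)]
    total = sum(exps)
    return [e / total for e in exps]


out = masked_softmax_row([1.0, 2.0, 3.0, 4.0], [True, False, True, False])
print(out)
```

With a mask that keeps roughly 1% of the entries, as in the example above, a sparse kernel only has to exponentiate and normalize that small fraction of each row.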

sparta.nn.tune(module: Module, sample_inputs: List[Tensor], algo: str = 'grid', max_trials: int = 9223372036854775807, tester_kw: Optional[Dict] = None, build_kw: Optional[Dict] = None, tuner_kw: Optional[Dict] = None, verbose: bool = False)
Find, tune and build all sparse operators in the model.
- Parameters
module (torch.nn.Module) – A PyTorch module that contains one or more sparse sub-modules.
sample_inputs (List[torch.Tensor]) – Sample input tensors to determine shape parameters.
algo (str, optional) – The algorithm used to search for the best parameters. Defaults to 'grid'.
max_trials (int, optional) – The maximum number of trials to run. Defaults to sys.maxsize.
tester_kw (Dict, optional) – The keyword arguments for the tester. Defaults to None.
build_kw (Dict, optional) – The keyword arguments for the builder (after tuning). Defaults to None.
tuner_kw (Dict, optional) – The keyword arguments for the tuner. Defaults to None.
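The interplay of a grid algorithm with `max_trials` can be sketched as a simple loop. This is an illustration under stated assumptions, not SparTA's tuner: `grid_search` and `toy_tester` are hypothetical, and the tester is assumed to return a latency where lower is better.

```python
import sys
from typing import Any, Callable, Dict, Iterable, Optional, Tuple

Params = Dict[str, Any]


def grid_search(
    candidates: Iterable[Params],
    tester: Callable[[Params], float],
    max_trials: int = sys.maxsize,
) -> Tuple[Optional[Params], float]:
    """Try candidates in order and keep the one with the lowest latency."""
    best_params: Optional[Params] = None
    best_latency = float('inf')
    for trial, params in enumerate(candidates):
        if trial >= max_trials:
            break  # trial budget exhausted
        latency = tester(params)
        if latency < best_latency:
            best_params, best_latency = params, latency
    return best_params, best_latency


def toy_tester(params: Params) -> float:
    """Stand-in for a real measurement: fastest at BLOCK_SIZE == 64."""
    return abs(params['BLOCK_SIZE'] - 64) + 1.0


candidates = [{'BLOCK_SIZE': b} for b in (16, 32, 64, 128)]
best, latency = grid_search(candidates, toy_tester)
capped_best, capped_latency = grid_search(candidates, toy_tester, max_trials=2)
print(best, latency, capped_best, capped_latency)
```

With the default `max_trials` the whole grid is visited; a smaller budget stops early and returns the best configuration seen so far, which may not be the global optimum.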