nlpstack.integrations.torch.util module

nlpstack.integrations.torch.util.add_positional_features(tensor, min_timescale=1.0, max_timescale=10000.0)
Return type:

TypeVar(TensorType, bound= Tensor)

Parameters:
  • tensor (TensorType) –

  • min_timescale (float) –

  • max_timescale (float) –

nlpstack.integrations.torch.util.batched_index_select(target, indices, flattened_indices=None)
Return type:

TypeVar(TensorType, bound= Tensor)

Parameters:
  • target (TensorType) –

  • indices (LongTensor) –

  • flattened_indices (LongTensor | None) –

nlpstack.integrations.torch.util.batched_span_select(target, spans)
Return type:

Tuple[TypeVar(TensorType, bound= Tensor), BoolTensor]

Parameters:
  • target (TensorType) –

  • spans (LongTensor) –

nlpstack.integrations.torch.util.combine_tensors(combination, tensors)
Return type:

TypeVar(TensorType, bound= Tensor)

Parameters:
  • combination (str) –

  • tensors (Sequence[TensorType]) –

nlpstack.integrations.torch.util.convert_to_toeplitz(inputs)
Return type:

Tensor

Parameters:

inputs (Tensor) –

nlpstack.integrations.torch.util.flatten_and_batch_shift_indices(indices, sequence_length)
Return type:

TypeVar(TensorType, bound= Tensor)

Parameters:
  • indices (TensorType) –

  • sequence_length (int) –

nlpstack.integrations.torch.util.fold(tensor, max_length)
Return type:

TypeVar(TensorType, bound= Tensor)

Parameters:
  • tensor (TensorType) –

  • max_length (int) –

nlpstack.integrations.torch.util.get_device_of(tensor)

Returns the device of the tensor.

Return type:

int

Parameters:

tensor (Tensor) –

nlpstack.integrations.torch.util.get_mask_from_text(text)
Parameters:

text (Mapping[str, Mapping[str, Tensor]]) – A nested mapping of the form Mapping[str, Mapping[str, torch.LongTensor]].

Return type:

BoolTensor

nlpstack.integrations.torch.util.get_range_vector(size, device)

Returns a range vector of the desired size, starting at 0. The CUDA implementation is meant to avoid copying data from CPU to GPU.

Return type:

Tensor

Parameters:
  • size (int) –

  • device (int) –

nlpstack.integrations.torch.util.get_token_ids_from_text(text)
Return type:

LongTensor

Parameters:

text (Mapping[str, Mapping[str, Tensor]]) –

nlpstack.integrations.torch.util.info_value_of_dtype(dtype)

Returns the finfo or iinfo object of a given PyTorch data type. Does not allow torch.bool.

Return type:

Union[finfo, iinfo]

Parameters:

dtype (dtype) –

nlpstack.integrations.torch.util.int_to_device(device)
Return type:

device

Parameters:

device (int | device) –

nlpstack.integrations.torch.util.logsumexp(tensor, dim=-1, keepdim=False)
Return type:

Tensor

Parameters:
  • tensor (Tensor) –

  • dim (int) –

  • keepdim (bool) –
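
The function name refers to the standard log-sum-exp reduction, log(sum(exp(x))). As a rough illustration of why a dedicated helper is useful, here is a minimal pure-Python sketch of the usual numerically stable form, which subtracts the maximum before exponentiating. It works on a flat list rather than a tensor, the helper name stable_logsumexp is hypothetical, and it is not nlpstack's implementation.

```python
import math
from typing import Sequence


def stable_logsumexp(values: Sequence[float]) -> float:
    """Compute log(sum(exp(v))) without overflowing exp()."""
    if not values:
        raise ValueError("values must be non-empty")
    peak = max(values)
    # If the largest entry is infinite, subtracting it would produce NaN,
    # so return it directly (+inf dominates; -inf means every term is zero).
    if math.isinf(peak):
        return peak
    # Subtracting the maximum keeps every exponent <= 0, so exp() cannot overflow.
    return peak + math.log(sum(math.exp(v - peak) for v in values))
```

A naive math.log(sum(math.exp(v) ...)) would raise OverflowError for inputs around 1000.0, while the shifted form handles them.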

nlpstack.integrations.torch.util.masked_max(vector, mask, dim, keepdim=False)
Return type:

TypeVar(TensorType, bound= Tensor)

Parameters:
  • vector (TensorType) –

  • mask (BoolTensor) –

  • dim (int) –

  • keepdim (bool) –

nlpstack.integrations.torch.util.masked_mean(vector, mask, dim, keepdim=False)
Return type:

TypeVar(TensorType, bound= Tensor)

Parameters:
  • vector (TensorType) –

  • mask (BoolTensor) –

  • dim (int) –

  • keepdim (bool) –

nlpstack.integrations.torch.util.masked_pool(inputs, mask=None, method='mean', dim=1, keepdim=False, window_size=None)
Return type:

TypeVar(TensorType, bound= Tensor)

Parameters:
  • inputs (TensorType) –

  • mask (BoolTensor | None) –

  • method (Literal['mean', 'max', 'sum', 'hier']) –

  • dim (int) –

  • keepdim (bool) –

  • window_size (int | None) –

nlpstack.integrations.torch.util.masked_softmax(vector, mask, dim=-1, memory_efficient=False)
Return type:

TypeVar(TensorType, bound= Tensor)

Parameters:
  • vector (TensorType) –

  • mask (BoolTensor) –

  • dim (int) –

  • memory_efficient (bool) –
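
The signature suggests a softmax restricted to the positions selected by mask. The sketch below shows that general idea in pure Python for a single 1-D row. It assumes that True marks positions that take part in the softmax and that a fully masked row yields all zeros. Both conventions, and the helper name masked_softmax_1d, are assumptions for illustration rather than documented nlpstack behaviour, and the memory_efficient option is not modelled.

```python
import math
from typing import List, Sequence


def masked_softmax_1d(scores: Sequence[float], mask: Sequence[bool]) -> List[float]:
    """Softmax over the entries where mask is True; masked entries get 0.0."""
    kept = [s for s, keep in zip(scores, mask) if keep]
    if not kept:
        # Assumption: a fully masked row produces zeros rather than NaN.
        return [0.0] * len(scores)
    # Shift by the largest unmasked score for numerical stability.
    peak = max(kept)
    exps = [math.exp(s - peak) if keep else 0.0 for s, keep in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]
```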

nlpstack.integrations.torch.util.max_value_of_dtype(dtype)

Returns the maximum value of a given PyTorch data type. Does not allow torch.bool.

Return type:

Union[float, int]

Parameters:

dtype (dtype) –

nlpstack.integrations.torch.util.min_value_of_dtype(dtype)

Returns the minimum value of a given PyTorch data type. Does not allow torch.bool.

Return type:

Union[float, int]

Parameters:

dtype (dtype) –

nlpstack.integrations.torch.util.move_to_device(obj, device)
Return type:

TypeVar(T)

Parameters:
  • obj (T) –

  • device (int | device) –

nlpstack.integrations.torch.util.replace_masked_values(tensor, mask, replace_with)
Return type:

TypeVar(TensorType, bound= Tensor)

Parameters:
  • tensor (TensorType) –

  • mask (BoolTensor) –

  • replace_with (float) –

nlpstack.integrations.torch.util.sequence_cross_entropy_with_logits(logits, targets, weights, average='batch', label_smoothing=None, gamma=None, alpha=None)
Return type:

FloatTensor

Parameters:
  • logits (FloatTensor) –

  • targets (LongTensor) –

  • weights (FloatTensor | BoolTensor) –

  • average (Literal['token', 'batch', 'none']) –

  • label_smoothing (float | None) –

  • gamma (float | None) –

  • alpha (float | List[float] | FloatTensor | None) –

nlpstack.integrations.torch.util.set_random_seed(seed)
Return type:

None

Parameters:

seed (int) –

nlpstack.integrations.torch.util.tensor_to_numpy(obj)
Return type:

Any

Parameters:

obj (Any) –

nlpstack.integrations.torch.util.tiny_value_of_dtype(dtype)

Returns a moderately tiny value for a given PyTorch data type, used to avoid numerical issues such as division by zero. This value is deliberately different from info_value_of_dtype(dtype).tiny, because using that value causes some NaN bugs. Only supports floating point dtypes.

Return type:

Union[float, int]

Parameters:

dtype (dtype) –
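
The division-by-zero guard described above can be illustrated without PyTorch. The following pure-Python sketch adds a small epsilon to a denominator that may be zero. The helper name masked_average and the epsilon of 1e-13 are illustrative assumptions, not values taken from nlpstack.

```python
from typing import Sequence


def masked_average(
    values: Sequence[float], mask: Sequence[bool], eps: float = 1e-13
) -> float:
    """Average the selected values; eps keeps the denominator non-zero."""
    total = sum(v for v, keep in zip(values, mask) if keep)
    count = sum(1 for keep in mask if keep)
    # With an all-False mask, count is 0 and the result is 0 / eps == 0.0
    # instead of a ZeroDivisionError.
    return total / (count + eps)
```

The epsilon is small enough that it barely perturbs the result when at least one value is selected.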

nlpstack.integrations.torch.util.unfold(tensor, original_length)
Return type:

TypeVar(TensorType, bound= Tensor)

Parameters:
  • tensor (TensorType) –

  • original_length (int) –

nlpstack.integrations.torch.util.viterbi_decode(tag_sequence, transition_matrix, top_k=None, tag_observations=None, allowed_start_transitions=None, allowed_end_transitions=None)

This implementation originally comes from AllenNLP (allenai/allennlp).

Perform Viterbi decoding in log space over a sequence given a transition matrix specifying pairwise (transition) potentials between tags and a matrix of shape (sequence_length, num_tags) specifying unary potentials for possible tags per timestep.

Parameters:
  • tag_sequence (Tensor) – A tensor of shape (sequence_length, num_tags) representing scores for a set of tags over a given sequence.

  • transition_matrix (Tensor) – A tensor of shape (num_tags, num_tags) representing the pairwise potentials for transitioning between a given pair of tags.

  • tag_observations (Optional[List[int]]) – A list of length sequence_length containing the class ids of observed elements in the sequence, with unobserved elements being set to -1. Note that it is possible to provide evidence which results in degenerate labelings if the sequences of tags you provide as evidence cannot transition between each other, or those transitions are extremely unlikely. In this situation we log a warning, but the responsibility for providing self-consistent evidence ultimately lies with the user.

  • allowed_start_transitions (Optional[Tensor]) – An optional tensor of shape (num_tags,) describing which tags the START token may transition to. If provided, additional transition constraints will be used for determining the start element of the sequence.

  • allowed_end_transitions (Optional[Tensor]) – An optional tensor of shape (num_tags,) describing which tags may transition to the end tag. If provided, additional transition constraints will be used for determining the end element of the sequence.

  • top_k (Optional[int]) – Optional integer specifying how many of the top paths to return. For top_k >= 1, returns a tuple of two lists: top_k_paths and top_k_scores. For top_k == None, returns a flattened tuple with just the top path and its score (not in lists, for backwards compatibility).

Returns:

  • viterbi_path – The tag indices of the maximum likelihood tag sequence.

  • viterbi_score – The score of the Viterbi path.
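
The decoding described above can be sketched in pure Python for the simplest case: a single best path, with no tag_observations and no start or end constraints. This is a minimal illustration rather than the nlpstack or AllenNLP code. The helper name viterbi_best_path is hypothetical, and the sketch assumes that transitions[i][j] is the score for moving from tag i to tag j.

```python
from typing import List, Sequence, Tuple


def viterbi_best_path(
    tag_scores: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> Tuple[List[int], float]:
    """Return the highest-scoring tag path and its score (log-space sums)."""
    if not tag_scores:
        raise ValueError("tag_scores must contain at least one timestep")
    num_tags = len(tag_scores[0])
    # best[j] is the best score of any path ending in tag j at the current step.
    best = list(tag_scores[0])
    backpointers: List[List[int]] = []
    for step_scores in tag_scores[1:]:
        pointers = []
        new_best = []
        for j in range(num_tags):
            # Pick the previous tag i that maximises best[i] + transitions[i][j].
            prev = max(range(num_tags), key=lambda i: best[i] + transitions[i][j])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][j] + step_scores[j])
        backpointers.append(pointers)
        best = new_best
    # Trace the best final tag back to the first timestep.
    last = max(range(num_tags), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]
```

Because scores are log-space potentials, they are added along a path rather than multiplied. A strongly negative off-diagonal transition score therefore discourages switching tags even when the unary scores favour it.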

nlpstack.integrations.torch.util.weighted_sum(matrix, attention)
Return type:

TypeVar(TensorType, bound= Tensor)

Parameters:
  • matrix (TensorType) –

  • attention (Tensor) –