sparse_caption.models package
Submodules
sparse_caption.models.att_model module
Created on 14 Oct 2020 14:19:19 https://github.com/ruotianluo/self-critical.pytorch/tree/3.2
This file contains the UpDown model.
UpDown is from “Bottom-Up and Top-Down Attention for Image Captioning and VQA” (https://arxiv.org/abs/1707.07998). However, this implementation may not be identical to the authors’ architecture.
- class sparse_caption.models.att_model.AttModel(config, tokenizer: Optional[sparse_caption.tokenizer.Tokenizer] = None)
Bases:
sparse_caption.models.caption_model.CaptionModel
- clip_att(att_feats, att_masks)
- get_logprobs_state(it, fc_feats, att_feats, p_att_feats, att_masks, state, output_logsoftmax=1)
- make_model()
- training: bool
- class sparse_caption.models.att_model.Attention(config)
Bases:
torch.nn.modules.module.Module
- forward(h, att_feats, p_att_feats, att_masks=None)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class sparse_caption.models.att_model.UpDownCore(config, use_maxout=False)
Bases:
torch.nn.modules.module.Module
- forward(xt, fc_feats, att_feats, p_att_feats, state, att_masks=None)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class sparse_caption.models.att_model.UpDownModel(config, tokenizer: Optional[sparse_caption.tokenizer.Tokenizer] = None)
Bases:
sparse_caption.models.att_model.AttModel
- COLLATE_FN
- static add_argparse_args(parser: Union[argparse._ArgumentGroup, argparse.ArgumentParser])
- training: bool
sparse_caption.models.att_model_prune module
Created on 14 Oct 2020 14:34:47 @author: jiahuei
- class sparse_caption.models.att_model_prune.AttModel(config, tokenizer: Optional[sparse_caption.tokenizer.Tokenizer] = None)
Bases:
sparse_caption.models.att_model.AttModel
- make_model()
- training: bool
- class sparse_caption.models.att_model_prune.Attention(config)
Bases:
sparse_caption.models.att_model.Attention
- training: bool
- class sparse_caption.models.att_model_prune.UpDownCore(config, use_maxout=False)
Bases:
sparse_caption.models.att_model.UpDownCore
- training: bool
- class sparse_caption.models.att_model_prune.UpDownModel(config, tokenizer: Optional[sparse_caption.tokenizer.Tokenizer] = None)
Bases:
sparse_caption.pruning.prune.PruningMixin, sparse_caption.models.att_model_prune.AttModel
- COLLATE_FN
- static add_argparse_args(parser: Union[argparse._ArgumentGroup, argparse.ArgumentParser])
sparse_caption.models.caption_model module
https://github.com/ruotianluo/self-critical.pytorch/tree/3.2
This file contains the ShowAttendTell and AllImg models.
ShowAttendTell is from “Show, Attend and Tell: Neural Image Caption Generation with Visual Attention” (https://arxiv.org/abs/1502.03044).
AllImg is a model in which the image feature is concatenated with the word embedding at every time step to form the LSTM input.
- class sparse_caption.models.caption_model.CaptionModel
Bases:
torch.nn.modules.module.Module
- batch_beam_search(init_state, init_logprobs, *args, **kwargs)
- forward(*args, **kwargs)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- static sample_next_word(logprobs, sample_method, temperature)
- training: bool
sparse_caption.models.relation_transformer module
https://github.com/yahoo/object_relation_transformer
# Please see LICENSE file in the project root for terms.
- class sparse_caption.models.relation_transformer.BoxMultiHeadedAttention(h, d_model, trigonometric_embedding=True, dropout=0.1, share_att=None)
Bases:
torch.nn.modules.module.Module
Self-attention layer with relative position weights, following the paper “Relation Networks for Object Detection” (https://arxiv.org/pdf/1711.11575.pdf).
- static BoxRelationalEmbedding(f_g, dim_g=64, wave_len=1000, trigonometric_embedding=True)
Given a tensor with bbox coordinates for detected objects on each batch image, this function computes a matrix for each image with entry (i, j) given by a vector representation of the displacement between the coordinates of bbox_i and bbox_j.
input: np.array of shape=(batch_size, max_nr_bounding_boxes, 4)
output: np.array of shape=(batch_size, max_nr_bounding_boxes, max_nr_bounding_boxes, 64)
- static box_attention(query, key, value, box_relation_embds_matrix, mask=None, dropout=None)
Compute scaled dot-product attention as in the paper “Relation Networks for Object Detection”. Follows the implementation in https://github.com/heefe92/Relation_Networks-pytorch/blob/master/model.py#L1026-L1055
- forward(input_query, input_key, input_value, input_box, mask=None)
Implements Figure 2 of “Relation Networks for Object Detection”.
- training: bool
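The pairwise displacement described for BoxRelationalEmbedding can be sketched as follows. This is a minimal illustration in plain Python, not the library code: it assumes boxes are given as (x_min, y_min, x_max, y_max) with positive width and height, uses the 4-dimensional log-ratio geometry features from the Relation Networks paper, and stops before any trigonometric embedding to 64 dimensions. The names `box_geometry_features` and `pairwise_geometry` are hypothetical.

```python
import math


def box_geometry_features(box_i, box_j, eps=1e-3):
    """Return a 4-d relative geometry vector between two boxes.

    Boxes are assumed to be (x_min, y_min, x_max, y_max) tuples with
    positive width and height (an assumption of this sketch).
    """
    def center_size(box):
        x_min, y_min, x_max, y_max = box
        return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0,
                x_max - x_min, y_max - y_min)

    cx_i, cy_i, w_i, h_i = center_size(box_i)
    cx_j, cy_j, w_j, h_j = center_size(box_j)
    # Clamp the normalized offsets so log() stays finite when centers coincide.
    dx = math.log(max(abs(cx_i - cx_j) / w_i, eps))
    dy = math.log(max(abs(cy_i - cy_j) / h_i, eps))
    # Log-ratios of the box sizes.
    dw = math.log(w_j / w_i)
    dh = math.log(h_j / h_i)
    return [dx, dy, dw, dh]


def pairwise_geometry(boxes):
    """Build the (n, n, 4) nested list with entry (i, j) describing box j relative to box i."""
    return [[box_geometry_features(bi, bj) for bj in boxes] for bi in boxes]


boxes = [(0.0, 0.0, 10.0, 10.0), (10.0, 0.0, 20.0, 20.0)]
feats = pairwise_geometry(boxes)
```

Here `feats[i][j]` plays the role of the per-pair vector; the documented function additionally handles a batch dimension and returns 64-dimensional vectors.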
- class sparse_caption.models.relation_transformer.Encoder(layer, N, share_layer=None)
Bases:
torch.nn.modules.module.Module
Core encoder is a stack of N layers.
- forward(x, box, mask)
Pass the input (and mask) through each layer in turn.
- training: bool
- class sparse_caption.models.relation_transformer.EncoderDecoder(encoder, decoder, src_embed, tgt_embed, generator)
Bases:
torch.nn.modules.module.Module
A standard Encoder-Decoder architecture. Base for this and many other models.
- decode(memory, src_mask, tgt, tgt_mask)
- encode(src, boxes, src_mask)
- forward(src, boxes, tgt, src_mask, tgt_mask)
Take in and process masked src and target sequences.
- training: bool
- class sparse_caption.models.relation_transformer.EncoderLayer(size, self_attn, feed_forward, dropout)
Bases:
torch.nn.modules.module.Module
Encoder is made up of self-attn and feed forward (defined below).
- forward(x, box, mask)
Follow Figure 1 (left) for connections.
- training: bool
- class sparse_caption.models.relation_transformer.RelationTransformerModel(config)
Bases:
sparse_caption.models.transformer.CachedTransformerBase
- COLLATE_FN
- static add_argparse_args(parser: Union[argparse._ArgumentGroup, argparse.ArgumentParser])
- static clip_att(att_feats, att_masks)
- get_logprobs_state(it, memory, mask, state)
state = [ys.unsqueeze(0)]
- make_model(h=8, dropout=0.1)
Helper: Construct a model from hyperparameters.
- static subsequent_mask(size)
Mask out subsequent positions.
- training: bool
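The idea behind subsequent_mask (“Mask out subsequent positions”) can be illustrated with a small sketch. This is plain Python rather than the library's tensor code, and `causal_mask` is a hypothetical name; the convention that True means “may attend” is an assumption of this example.

```python
def causal_mask(size):
    """Return a size x size mask; entry (i, j) is True when position i may attend to position j.

    Positions after i (j > i) are masked out, so a decoder cannot look ahead.
    """
    return [[j <= i for j in range(size)] for i in range(size)]


mask = causal_mask(3)
for row in mask:
    print(row)
```

The result is lower-triangular: the first position sees only itself, the last position sees every earlier position.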
sparse_caption.models.relation_transformer_prune module
Created on 09 Oct 2020 17:27:20 @author: jiahuei
- class sparse_caption.models.relation_transformer_prune.BoxMultiHeadedAttention(mask_type, mask_init_value, h, d_model, trigonometric_embedding=True, dropout=0.03333333333333333, share_att=None)
Bases:
sparse_caption.models.relation_transformer.BoxMultiHeadedAttention
Self-attention layer with relative position weights, following the paper “Relation Networks for Object Detection” (https://arxiv.org/pdf/1711.11575.pdf).
- training: bool
- class sparse_caption.models.relation_transformer_prune.CachedMultiHeadedAttention(mask_type, mask_init_value, h, d_model, dropout=0.03333333333333333, self_attention=False, share_att=None)
Bases:
sparse_caption.models.transformer.CachedMultiHeadedAttention
- training: bool
- class sparse_caption.models.relation_transformer_prune.Embeddings(mask_type, mask_init_value, d_model, vocab)
Bases:
sparse_caption.models.transformer.InputEmbedding
- training: bool
- class sparse_caption.models.relation_transformer_prune.EncoderDecoder(*, mask_type, mask_freeze_scope='', **kwargs)
Bases:
sparse_caption.pruning.prune.PruningMixin, sparse_caption.models.relation_transformer.EncoderDecoder
A standard Encoder-Decoder architecture. Base for this and many other models.
- class sparse_caption.models.relation_transformer_prune.Generator(mask_type, mask_init_value, d_model, vocab)
Bases:
sparse_caption.models.transformer.OutputEmbedding
Define standard linear + softmax generation step.
- training: bool
- class sparse_caption.models.relation_transformer_prune.PositionwiseFeedForward(mask_type, mask_init_value, d_model, d_ff, dropout=0.03333333333333333)
Bases:
sparse_caption.models.transformer.PositionwiseFeedForward
Implements FFN equation.
- training: bool
- class sparse_caption.models.relation_transformer_prune.RelationTransformerModel(config)
Bases:
sparse_caption.pruning.prune.PruningMixin, sparse_caption.models.relation_transformer.RelationTransformerModel
- static add_argparse_args(parser: Union[argparse._ArgumentGroup, argparse.ArgumentParser])
- make_model(h=8, dropout=0.03333333333333333)
Helper: Construct a model from hyperparameters.
sparse_caption.models.transformer module
Created on 28 Dec 2020 18:00:01 @author: jiahuei
Based on The Annotated Transformer https://nlp.seas.harvard.edu/2018/04/03/attention.html
- sparse_caption.models.transformer.CMHA
alias of
sparse_caption.models.transformer.CachedMultiHeadedAttention
- class sparse_caption.models.transformer.CachedMultiHeadedAttention(*args, **kwargs)
Bases:
sparse_caption.models.transformer.MultiHeadedAttention
- reset_cache()
- training: bool
- class sparse_caption.models.transformer.CachedTransformerBase(config)
Bases:
sparse_caption.models.caption_model.CaptionModel
- static add_argparse_args(parser: Union[argparse._ArgumentGroup, argparse.ArgumentParser])
- static disable_incremental_decoding(module)
- static enable_incremental_decoding(module)
- training: bool
- class sparse_caption.models.transformer.Decoder(layer, N, share_layer=None)
Bases:
torch.nn.modules.module.Module
Generic N layer decoder with masking.
- forward(x, memory, src_mask, tgt_mask)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class sparse_caption.models.transformer.DecoderLayer(size, self_attn, src_attn, feed_forward, dropout)
Bases:
torch.nn.modules.module.Module
Decoder is made of self-attn, src-attn, and feed forward (defined below).
- forward(x, memory, src_mask, tgt_mask)
Follow Figure 1 (right) for connections.
- training: bool
- class sparse_caption.models.transformer.Encoder(layer, N, share_layer=None)
Bases:
torch.nn.modules.module.Module
Core encoder is a stack of N layers.
- forward(x, mask)
Pass the input (and mask) through each layer in turn.
- training: bool
- class sparse_caption.models.transformer.EncoderDecoder(encoder: Callable, decoder: Callable, src_embed: Callable, tgt_embed: Callable, generator: Callable, autoregressive: bool = True, pad_idx: int = 0)
Bases:
torch.nn.modules.module.Module
A standard Encoder-Decoder architecture. Base for this and many other models.
- decode(tgt: torch.Tensor, memory: torch.Tensor, memory_mask: torch.Tensor)
- Parameters
tgt – (N, T)
memory – (N, S, E)
memory_mask – (N, S)
Returns:
- encode(src: torch.Tensor, src_mask: torch.Tensor)
- Parameters
src – (N, S, E)
src_mask – (N, S)
Returns:
- forward(src: torch.Tensor, src_mask: torch.Tensor, tgt: torch.Tensor)
- Parameters
src – (N, S, E)
src_mask – (N, S)
tgt – (N, T)
Returns:
- generate(x)
- training: bool
- class sparse_caption.models.transformer.EncoderLayer(size, self_attn, feed_forward, dropout)
Bases:
torch.nn.modules.module.Module
Encoder is made up of self-attn and feed forward.
- forward(x, mask)
Follow Figure 1 (left) for connections.
- training: bool
- class sparse_caption.models.transformer.InputEmbedding(d_model, vocab)
Bases:
torch.nn.modules.module.Module
- forward(x)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class sparse_caption.models.transformer.LayerNorm(features, eps=1e-06)
Bases:
torch.nn.modules.module.Module
Construct a layernorm module (see citation for details).
- forward(x)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- sparse_caption.models.transformer.MHA
alias of
sparse_caption.models.transformer.MultiHeadedAttention
- class sparse_caption.models.transformer.MultiHeadedAttention(h, d_model, dropout=0.1, self_attention=False, share_att=None)
Bases:
torch.nn.modules.module.Module
- static attention(query, key, value, mask=None, dropout=None)
Compute ‘Scaled Dot Product Attention’
- forward(query, key, value, mask=None)
Implements Figure 2
- training: bool
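For readers unfamiliar with the “Scaled Dot Product Attention” that MultiHeadedAttention.attention refers to, the computation can be sketched as below. This is a single-head illustration on nested Python lists, assuming a boolean mask where True means “keep” and omitting dropout; the function names are hypothetical and the library's tensor implementation may differ in detail.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(query, key, value, mask=None):
    """query: T x d, key: S x d, value: S x d_v, mask: T x S booleans (True = keep).

    Returns (outputs, weights) with shapes T x d_v and T x S.
    """
    d_k = len(query[0])
    outputs, weights = [], []
    for t, q in enumerate(query):
        # Dot product of the query with every key, scaled by sqrt(d_k).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in key]
        if mask is not None:
            # Masked positions get a large negative score, so their weight is ~0.
            scores = [s if keep else -1e9 for s, keep in zip(scores, mask[t])]
        p = softmax(scores)
        weights.append(p)
        # Weighted sum of the value vectors.
        outputs.append([sum(w * v[c] for w, v in zip(p, value))
                        for c in range(len(value[0]))])
    return outputs, weights


q = [[1.0, 0.0]]
k = [[1.0, 0.0], [0.0, 1.0]]
v = [[1.0], [3.0]]
out, attn = scaled_dot_product_attention(q, k, v)
out_masked, attn_masked = scaled_dot_product_attention(q, k, v, mask=[[True, False]])
```

With the mask applied, the second key receives no weight and the output collapses to the first value vector.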
- class sparse_caption.models.transformer.OutputEmbedding(d_model, vocab)
Bases:
torch.nn.modules.module.Module
Define standard linear + softmax generation step.
- forward(x)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class sparse_caption.models.transformer.PositionalEncoding(d_model, dropout, max_len=5000)
Bases:
torch.nn.modules.module.Module
Implement the PE function.
- forward(x)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- reset_cache()
- training: bool
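Since this module is based on The Annotated Transformer, the “PE function” is presumably the sinusoidal positional encoding; that is an assumption, as the entry above does not spell it out. A minimal plain-Python sketch of that encoding, with the hypothetical name `positional_encoding`, looks like this:

```python
import math


def positional_encoding(max_len, d_model):
    """Return a max_len x d_model table of sinusoidal position encodings.

    Even columns hold sin(pos / 10000^(i / d_model)) and the following odd
    column holds the matching cosine, where i is the even column index.
    """
    pe = [[0.0] * d_model for _ in range(max_len)]
    for pos in range(max_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe


pe = positional_encoding(max_len=4, d_model=6)
```

In a typical module the table is precomputed up to max_len and the first rows are added to the input embeddings before dropout.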
- class sparse_caption.models.transformer.PositionwiseFeedForward(d_model, d_ff, dropout=0.1)
Bases:
torch.nn.modules.module.Module
Implements FFN equation.
- forward(x)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class sparse_caption.models.transformer.SublayerConnection(size, dropout)
Bases:
torch.nn.modules.module.Module
A residual connection followed by a layer norm. Note for code simplicity the norm is first as opposed to last.
- forward(x, sublayer)
Apply residual connection to any sublayer with the same size.
- training: bool
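The “norm is first” ordering noted for SublayerConnection can be sketched as follows: the input is normalized, passed through the sublayer, and then added back to the unnormalized input. This example uses plain lists, a stand-in normalization that only centers the vector, and no dropout; all names are hypothetical and chosen for illustration.

```python
def center(x):
    """Stand-in for layer normalization: subtract the mean (no scaling)."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]


def sublayer_connection(x, sublayer, norm=center):
    """Pre-norm residual: return x + sublayer(norm(x)), element by element."""
    transformed = sublayer(norm(x))
    return [xi + yi for xi, yi in zip(x, transformed)]


def double(v):
    """Toy sublayer that doubles every element."""
    return [2.0 * e for e in v]


result = sublayer_connection([1.0, 3.0], double)
```

For the input [1.0, 3.0], centering gives [-1.0, 1.0], the toy sublayer doubles it to [-2.0, 2.0], and the residual sum is [-1.0, 5.0].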
- class sparse_caption.models.transformer.Transformer(config)
Bases:
sparse_caption.models.transformer.CachedTransformerBase
- COLLATE_FN
- static add_argparse_args(parser: Union[argparse._ArgumentGroup, argparse.ArgumentParser])
- get_logprobs_state(it, memory, mask, state)
state = [ys.unsqueeze(0)]
- make_model()
- training: bool
Module contents
Created on 28 Aug 2020 12:43:22 @author: jiahuei
- sparse_caption.models.get_model(name: str) → Any
- sparse_caption.models.register_model(name)
New models can be added with the register_model() function decorator.
For example:
@register_model('relation_transformer')
class RelationTransformerModel:
    (...)
- Parameters
name (str) – the name of the model