sparse_caption.data package

Submodules

sparse_caption.data.collate module

Created on 21 Apr 2020 22:25:24 @author: jiahuei

class sparse_caption.data.collate.AttCollate(*args, **kwargs)

Bases: sparse_caption.data.collate.UpDownCollate

static add_argparse_args(parser: Union[argparse._ArgumentGroup, argparse.ArgumentParser])

class sparse_caption.data.collate.BasicCollate(config, img_transform: torchvision.transforms.transforms.Compose, tokenizer: sparse_caption.tokenizer.Tokenizer)

Bases: object

class sparse_caption.data.collate.ListDataset(data: List)

Bases: torch.utils.data.dataset.Dataset

Basically a list, but a subclass of Dataset.
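
As a rough illustration of the idea (not the package's actual implementation), a list-backed map-style dataset only needs `__getitem__` and `__len__`. The class name `ListDatasetSketch` is hypothetical, and the fallback base class is an assumption made so the sketch also runs where PyTorch is not installed.

```python
try:
    # Real base class when PyTorch is installed.
    from torch.utils.data import Dataset
except ImportError:
    # Assumption: fall back to a plain object so the sketch still runs.
    Dataset = object


class ListDatasetSketch(Dataset):
    """Hypothetical stand-in: wraps a list so it can be indexed like a map-style dataset."""

    def __init__(self, data):
        self.data = list(data)

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return len(self.data)


samples = ListDatasetSketch(["a.jpg", "b.jpg", "c.jpg"])
print(len(samples), samples[1])
```

Wrapping a list this way lets it be passed to anything that expects a dataset object, such as a data loader.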

class sparse_caption.data.collate.ObjectRelationCollate(*args, **kwargs)

Bases: sparse_caption.data.collate.UpDownCollate

static add_argparse_args(parser: Union[argparse._ArgumentGroup, argparse.ArgumentParser])

class sparse_caption.data.collate.UpDownCollate(config, tokenizer: sparse_caption.tokenizer.Tokenizer, cache_dict: Optional[Dict] = None)

Bases: object

static add_argparse_args(parser: Union[argparse._ArgumentGroup, argparse.ArgumentParser])
join_default_bu_dir(dirname)

sparse_caption.data.karpathy module

Created on 01 Aug 2020 18:19:23 @author: jiahuei

class sparse_caption.data.karpathy.KarpathyDataset(config: sparse_caption.utils.config.Config)

Bases: abc.ABC

Captioning data and inputs.

ANNOTATION_FILE = ''
DEFAULT_ANNOT_DIR = '/home/docs/checkouts/readthedocs.org/user_builds/sparse-image-captioning/checkouts/latest/sparse_caption/coco_caption/annotations'
RAW_JSON_FILE = ''
static add_argparse_args(parser: Union[argparse._ArgumentGroup, argparse.ArgumentParser])
coco_annot_json_dump() None

Generate COCO-style annotation file, if it does not exist.

coco_caption_json_dump(img_fname_caption_pair: Iterable[Tuple[str, ...]], output_fpath: str) None

Takes [(img_fname_str, caption_str), …] as img_fname_caption_pair and saves the results as a JSON file compatible with the coco_caption evaluation format.

Parameters
  • img_fname_caption_pair – An Iterable of (image name, caption string).

  • output_fpath – The path for the output file.

Returns

The file path of the JSON file.
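
A minimal sketch of what such a dump might look like, assuming the usual coco_caption results layout (a JSON list of objects with `image_id` and `caption` keys). The function name `dump_coco_results`, the explicit `filename_to_id` argument, and the sample file names are all hypothetical; the real method is bound to the dataset class and may differ in detail.

```python
import json
import os
import tempfile
from typing import Callable, Iterable, Tuple


def dump_coco_results(
    img_fname_caption_pair: Iterable[Tuple[str, str]],
    output_fpath: str,
    filename_to_id: Callable[[str], int],
) -> str:
    """Write (image file name, caption) pairs as a coco_caption-style results file."""
    results = [
        {"image_id": filename_to_id(fname), "caption": caption}
        for fname, caption in img_fname_caption_pair
    ]
    with open(output_fpath, "w", encoding="utf-8") as f:
        json.dump(results, f)
    return output_fpath


# Usage with a toy ID function that reads the digits before the extension.
pairs = [("000000000042.jpg", "a dog on a beach"), ("000000000007.jpg", "two cats")]
with tempfile.TemporaryDirectory() as tmp_dir:
    path = dump_coco_results(
        pairs,
        os.path.join(tmp_dir, "results.json"),
        filename_to_id=lambda name: int(os.path.splitext(name)[0]),
    )
    with open(path, encoding="utf-8") as f:
        loaded = json.load(f)
print(loaded)
```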

download_and_process_karpathy_json()
get_split(split: str, generation_mode: bool = False)
abstract static image_filename_to_id(filename: str) int

Given the file name of an image, return its image ID (integer).

abstract prepare_data()

Download, process, tokenize, etc.

random_image_check(num_samples: int = 5) None
train_captions_txt_dump() None

Generate a text file, one sentence per line. Used to train tokenizer.
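
The one-sentence-per-line layout could be produced along these lines. This is only a sketch: `dump_captions_txt` is a hypothetical helper, and collapsing internal whitespace is an assumption made so that each caption stays on a single line.

```python
import os
import tempfile
from typing import Iterable


def dump_captions_txt(captions: Iterable[str], output_fpath: str) -> None:
    """Write one caption per line, suitable as tokenizer training text."""
    with open(output_fpath, "w", encoding="utf-8") as f:
        for caption in captions:
            # Collapse newlines and repeated spaces so one caption is one line.
            f.write(" ".join(caption.split()) + "\n")


with tempfile.TemporaryDirectory() as tmp_dir:
    txt_path = os.path.join(tmp_dir, "train_captions.txt")
    dump_captions_txt(["a man riding  a horse", "two dogs\nplaying"], txt_path)
    with open(txt_path, encoding="utf-8") as f:
        lines = f.read().splitlines()
print(lines)
```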

property train_size

sparse_caption.data.mscoco module

Created on 21 Apr 2020 00:25:38 @author: jiahuei

class sparse_caption.data.mscoco.MscocoDataset(config)

Bases: sparse_caption.data.karpathy.KarpathyDataset

COCO data and inputs.

ANNOTATION_FILE = 'captions_val2014.json'
RAW_JSON_FILE = 'dataset_coco.json'
static add_argparse_args(parser: Union[argparse._ArgumentGroup, argparse.ArgumentParser])

Adds dataset arguments to ArgumentParser.
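
The pattern behind this kind of static method can be sketched as follows. The class name `DatasetSketch` and the `--dataset_dir` and `--batch_size` flags are hypothetical and chosen only for illustration; the actual arguments registered by MscocoDataset are not listed here.

```python
import argparse
from typing import Union


class DatasetSketch:
    """Hypothetical class showing the add_argparse_args pattern."""

    @staticmethod
    def add_argparse_args(
        parser: Union[argparse._ArgumentGroup, argparse.ArgumentParser],
    ) -> None:
        # Illustrative flags only; not the package's real argument names.
        parser.add_argument("--dataset_dir", type=str, default="./datasets")
        parser.add_argument("--batch_size", type=int, default=15)


parser = argparse.ArgumentParser()
# Either the parser itself or an argument group can be passed in.
group = parser.add_argument_group("dataset")
DatasetSketch.add_argparse_args(group)
args = parser.parse_args(["--batch_size", "32"])
print(args.dataset_dir, args.batch_size)
```

Keeping the method static lets a training script collect arguments from every component before any of them is instantiated.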

get_test2014_split()
static image_filename_to_id(filename: str) int

Given the file name of an image, return its image ID (integer).
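
For COCO 2014 images, whose file names typically look like `COCO_val2014_000000391895.jpg`, the mapping might be sketched as below. The function name `coco_filename_to_id` is hypothetical and the parsing rule is an assumption based on that naming scheme, not a copy of the package's code.

```python
import os


def coco_filename_to_id(filename: str) -> int:
    """Parse the zero-padded integer at the end of a COCO-style file name."""
    stem = os.path.splitext(os.path.basename(filename))[0]
    # The ID is the last underscore-separated field; int() drops the padding.
    return int(stem.split("_")[-1])


val_id = coco_filename_to_id("COCO_val2014_000000391895.jpg")
train_id = coco_filename_to_id("images/COCO_train2014_000000000009.jpg")
print(val_id, train_id)
```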

prepare_data()

Download, process, tokenize, etc.

class sparse_caption.data.mscoco.MscocoTesting(config)

Bases: sparse_caption.data.mscoco.MscocoDataset

RAW_JSON_FILE = 'dataset_coco_testing.json'

Module contents

Created on 20 Apr 2020 19:00:29 @author: jiahuei

Adapted from:

https://raw.githubusercontent.com/pytorch/fairseq/v0.9.0/fairseq/models/__init__.py
http://scottlobdell.me/2015/08/using-decorators-python-automatic-registration/

Copyright (c) Facebook, Inc. and its affiliates.

This source code is licensed under the MIT license found in the LICENSE file in the root directory of this source tree.

sparse_caption.data.get_dataset(name: str) Type[sparse_caption.data.karpathy.KarpathyDataset]
sparse_caption.data.register_dataset(name)

New datasets can be added with the register_dataset() function decorator.

For example:

@register_dataset('mscoco')
class MscocoDataset(KarpathyDataset):
    ...

Parameters

name (str) – the name of the dataset
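
The decorator-based registry pattern can be sketched as follows. This is a simplified illustration in the spirit of the fairseq code it is adapted from, not the package's exact implementation: the `DATASET_REGISTRY` dictionary, the duplicate-name check, and the `ToyDataset` class are assumptions.

```python
from typing import Callable, Dict

# Assumed module-level mapping from dataset name to dataset class.
DATASET_REGISTRY: Dict[str, type] = {}


def register_dataset(name: str) -> Callable[[type], type]:
    """Return a class decorator that records the class under `name`."""

    def decorator(cls: type) -> type:
        if name in DATASET_REGISTRY:
            raise ValueError(f"Cannot register duplicate dataset: {name}")
        DATASET_REGISTRY[name] = cls
        return cls  # hand the class back unchanged

    return decorator


def get_dataset(name: str) -> type:
    """Look up a previously registered dataset class by name."""
    return DATASET_REGISTRY[name]


@register_dataset("toy")
class ToyDataset:
    pass


print(get_dataset("toy").__name__)
```

Because registration happens at import time, a script can select a dataset from a string such as a command-line value without importing each class by hand.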