@@ -560,9 +560,9 @@ class Dataset:
         Note:
             1. If count is greater than the number of element in dataset or equal to -1,
-            all the element in dataset will be taken.
+               all the element in dataset will be taken.
             2. The order of using take and batch effects. If take before batch operation,
-            then taken given number of rows, otherwise take given number of batches.
+               then taken given number of rows, otherwise take given number of batches.
 
         Args:
             count (int, optional): Number of elements to be taken from the dataset (default=-1).
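The ordering rule in the note above can be illustrated with a minimal pure-Python sketch. The `take` and `batch` helpers below are hypothetical stand-ins for the semantics described, not MindSpore's `Dataset` API:

```python
# Hypothetical stand-ins for the take/batch semantics described in the note.
def take(rows, count=-1):
    # count == -1 (or count >= len(rows)) takes every element.
    return list(rows) if count == -1 else list(rows)[:count]

def batch(rows, batch_size):
    # Group consecutive rows into batches of batch_size.
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]

rows = list(range(10))

# take before batch: 4 *rows* survive, then they are batched.
before = batch(take(rows, 4), batch_size=2)   # [[0, 1], [2, 3]]

# take after batch: 4 *batches* survive.
after = take(batch(rows, batch_size=2), 4)    # [[0, 1], [2, 3], [4, 5], [6, 7]]
```

The same `count` thus yields 4 rows in one order and 8 rows (4 batches of 2) in the other.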
@@ -590,7 +590,7 @@ class Dataset:
         # here again
         dataset_size = self.get_dataset_size()
 
-        if(dataset_size is None or dataset_size <= 0):
+        if dataset_size is None or dataset_size <= 0:
             raise RuntimeError("dataset size unknown, unable to split.")
 
         all_int = all(isinstance(item, int) for item in sizes)
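The hunk above touches the size validation at the top of `split`. A standalone sketch of that check, assuming the usual convention that all-int sizes are absolute row counts and all-float sizes are fractions summing to 1 (the sum rules here are an assumption, not copied from this diff):

```python
# Hypothetical standalone version of the split-size validation;
# the sum-to-dataset-size / sum-to-1 rules are assumed conventions.
def validate_split(dataset_size, sizes):
    if dataset_size is None or dataset_size <= 0:
        raise RuntimeError("dataset size unknown, unable to split.")
    all_int = all(isinstance(item, int) for item in sizes)
    all_float = all(isinstance(item, float) for item in sizes)
    if all_int and sum(sizes) != dataset_size:
        raise ValueError("integer sizes must sum to the dataset size.")
    if all_float and abs(sum(sizes) - 1.0) > 1e-6:
        raise ValueError("fractional sizes must sum to 1.")
    return all_int

validate_split(10, [7, 3])      # absolute row counts: OK
validate_split(10, [0.7, 0.3])  # fractions: OK
```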
@@ -640,8 +640,8 @@ class Dataset:
         Note:
             1. Dataset cannot be sharded if split is going to be called.
             2. It is strongly recommended to not shuffle the dataset, but use randomize=True instead.
-            Shuffling the dataset may not be deterministic, which means the data in each split
-            will be different in each epoch.
+               Shuffling the dataset may not be deterministic, which means the data in each split
+               will be different in each epoch.
 
         Raises:
             RuntimeError: If get_dataset_size returns None or is not supported for this dataset.
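Why a randomized-but-seeded split is reproducible while an up-front shuffle is not can be sketched in pure Python (this is an illustration of the principle, not MindSpore's implementation):

```python
import random

# Sketch: a split driven by a fixed seed produces the same row
# assignment on every call, so each epoch sees the same splits.
def split_rows(n, sizes, seed):
    rng = random.Random(seed)      # fixed seed -> same permutation every time
    order = list(range(n))
    rng.shuffle(order)
    out, start = [], 0
    for size in sizes:
        out.append(order[start:start + size])
        start += size
    return out

a = split_rows(10, [7, 3], seed=0)
b = split_rows(10, [7, 3], seed=0)
# a == b: identical splits across calls; an unseeded shuffle before
# splitting would give a different partition each time.
```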
@@ -1173,6 +1173,7 @@ class SourceDataset(Dataset):
     def is_sharded(self):
         raise NotImplementedError("SourceDataset must implement is_sharded.")
 
+
 class MappableDataset(SourceDataset):
     """
     Abstract class to represent a source dataset which supports use of samplers.
@@ -1253,13 +1254,13 @@ class MappableDataset(SourceDataset):
         Note:
             1. Dataset should not be sharded if split is going to be called. Instead, create a
-            DistributedSampler and specify a split to shard after splitting. If dataset is
-            sharded after a split, it is strongly recommended to set the same seed in each instance
-            of execution, otherwise each shard may not be part of the same split (see Examples)
+               DistributedSampler and specify a split to shard after splitting. If dataset is
+               sharded after a split, it is strongly recommended to set the same seed in each instance
+               of execution, otherwise each shard may not be part of the same split (see Examples)
             2. It is strongly recommended to not shuffle the dataset, but use randomize=True instead.
-            Shuffling the dataset may not be deterministic, which means the data in each split
-            will be different in each epoch. Furthermore, if sharding occurs after split, each
-            shard may not be part of the same split.
+               Shuffling the dataset may not be deterministic, which means the data in each split
+               will be different in each epoch. Furthermore, if sharding occurs after split, each
+               shard may not be part of the same split.
 
         Raises:
             RuntimeError: If get_dataset_size returns None or is not supported for this dataset.
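The "same seed in each instance of execution" requirement can be shown with a small sketch: two processes that split with the same seed agree on the partition, so their shards are disjoint pieces of the same train split. The helpers are hypothetical illustrations, not MindSpore's `DistributedSampler`:

```python
import random

# Hypothetical helpers illustrating shard-after-split with a shared seed.
def split_rows(n, sizes, seed):
    rng = random.Random(seed)
    order = list(range(n))
    rng.shuffle(order)
    out, start = [], 0
    for size in sizes:
        out.append(order[start:start + size])
        start += size
    return out

def shard(rows, num_shards, shard_id):
    # Round-robin sharding, as a distributed sampler might do.
    return rows[shard_id::num_shards]

# Two "processes" using the SAME seed agree on the split, so their
# shards partition the same train split. With different seeds, each
# process would shard a different partition and rows could leak
# between train and validation.
train_p0 = shard(split_rows(12, [8, 4], seed=42)[0], num_shards=2, shard_id=0)
train_p1 = shard(split_rows(12, [8, 4], seed=42)[0], num_shards=2, shard_id=1)
```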