
# Use Custom Dataloaders

## How the Existing Dataloader Works

Detectron2 contains a builtin data loading pipeline.
It's good to understand how it works, in case you need to write a custom one.

Detectron2 provides two functions,
[build_detection_{train,test}_loader](../modules/data.html#detectron2.data.build_detection_train_loader),
that create a default data loader from a given config.
Here is how `build_detection_{train,test}_loader` works:

1. It takes the name of the dataset (e.g., "coco_2017_train") and loads a `list[dict]` representing the dataset items
   in a lightweight, canonical format. These dataset items are not yet ready to be used by the model (e.g., images are
   not loaded into memory, random augmentations have not been applied, etc.).
   Details about the dataset format and dataset registration can be found in
   [datasets](datasets.html).
2. Each dict in this list is mapped by a function ("mapper"):
   * Users can customize this mapping function by specifying the "mapper" argument in
     `build_detection_{train,test}_loader`. The default mapper is [DatasetMapper](../modules/data.html#detectron2.data.DatasetMapper).
   * The output format of such a function can be arbitrary, as long as it is accepted by the consumer of this data loader (usually the model).
   * The role of the mapper is to transform the lightweight, canonical representation of a dataset item into a format
     that is ready for the model to consume (including, e.g., reading images, performing random data augmentation and converting to torch Tensors).
     The output format of the default mapper is explained below.
3. The outputs of the mapper are batched (simply into a list).
4. This batched data is the output of the data loader. Typically, it's also the input of
   `model.forward()`.
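
The four steps above can be sketched in plain Python. This is only an illustration of the data flow, not Detectron2's implementation: `DATASET_DICTS`, `toy_mapper` and `toy_loader` are hypothetical names, the mapper records an image shape instead of reading pixels, and the loader makes a single pass without shuffling.

```python
import copy

# Hypothetical stand-in for step 1: a list[dict] in a lightweight format
# (file paths and sizes only, no pixel data loaded yet).
DATASET_DICTS = [
    {"file_name": "img0.jpg", "height": 480, "width": 640},
    {"file_name": "img1.jpg", "height": 600, "width": 800},
    {"file_name": "img2.jpg", "height": 480, "width": 640},
]


def toy_mapper(dataset_dict):
    """Step 2: turn one lightweight dict into a model-ready dict.

    A real mapper would read the image and apply augmentations; this
    placeholder only records the shape the loaded image would have.
    """
    item = copy.deepcopy(dataset_dict)  # do not mutate the shared dataset dict
    item["image_shape"] = (3, item["height"], item["width"])
    return item


def toy_loader(dataset_dicts, mapper, batch_size):
    """Steps 3 and 4: batch mapped items into plain lists and yield them."""
    batch = []
    for dataset_dict in dataset_dicts:
        batch.append(mapper(dataset_dict))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final, possibly smaller batch
        yield batch


# Each element of `batches` is a list[dict], the kind of object a model
# would receive as its input.
batches = list(toy_loader(DATASET_DICTS, toy_mapper, batch_size=2))
```

Batching "simply into a list" means no tensors are stacked at this stage, so items in one batch may have different image sizes.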

## Write a Custom Dataloader

Using a different "mapper" with `build_detection_{train,test}_loader(mapper=)` works for most use cases
of custom data loading. Refer to the [API documentation](../modules/data.html) for details.
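
As a minimal sketch of the "mapper" route, the wrapper below drops crowd annotations from each dataset dict before handing it to another mapper. The names `without_crowd` and `build_train_loader_without_crowd` are hypothetical, and the sketch assumes the dataset dicts carry an `"annotations"` list whose entries may have an `"iscrowd"` flag. The Detectron2 import is deferred so the wrapper itself has no dependencies.

```python
import copy


def without_crowd(base_mapper):
    """Wrap `base_mapper` so crowd annotations are removed before mapping."""

    def mapper(dataset_dict):
        item = copy.deepcopy(dataset_dict)  # keep the original dict untouched
        item["annotations"] = [
            ann for ann in item.get("annotations", []) if not ann.get("iscrowd", 0)
        ]
        return base_mapper(item)

    return mapper


def build_train_loader_without_crowd(cfg):
    """Build a training loader that uses the wrapped default mapper.

    Requires Detectron2; `cfg` is assumed to be a Detectron2 config.
    """
    from detectron2.data import DatasetMapper, build_detection_train_loader

    base_mapper = DatasetMapper(cfg, is_train=True)
    return build_detection_train_loader(cfg, mapper=without_crowd(base_mapper))
```

Wrapping the default mapper keeps its image loading and augmentation behavior, so only the filtering step is new.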

If you want to do something different (e.g., use different sampling or batching logic),
you can write your own data loader. The data loader is simply a
Python iterator that produces [the format](models.html) your model accepts.
You can implement it using any tools you like.
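
To illustrate custom sampling and batching logic, here is a sketch of a data loader written as a plain Python iterable. The class name `OrientationGroupedLoader` is hypothetical. It assumes each dataset dict has `"width"` and `"height"` keys, samples items at random, and only emits batches whose items share the same orientation.

```python
import itertools
import random


class OrientationGroupedLoader:
    """Infinite iterable yielding list[dict] batches of same-orientation items."""

    def __init__(self, dataset_dicts, mapper, batch_size, seed=0):
        self.dataset_dicts = dataset_dicts
        self.mapper = mapper
        self.batch_size = batch_size
        self.rng = random.Random(seed)  # seeded for reproducible sampling

    def __iter__(self):
        buckets = {"landscape": [], "portrait": []}
        while True:
            # Custom sampling: draw one item uniformly at random.
            dataset_dict = self.rng.choice(self.dataset_dicts)
            wide = dataset_dict["width"] >= dataset_dict["height"]
            bucket = buckets["landscape" if wide else "portrait"]
            bucket.append(self.mapper(dataset_dict))
            # Custom batching: emit a bucket only once it is full.
            if len(bucket) == self.batch_size:
                yield list(bucket)
                bucket.clear()


dataset_dicts = [
    {"file_name": "a.jpg", "width": 640, "height": 480},
    {"file_name": "b.jpg", "width": 480, "height": 640},
]
# `dict` acts as a trivial mapper here: it just copies each item.
loader = OrientationGroupedLoader(dataset_dicts, mapper=dict, batch_size=2)
first_batches = list(itertools.islice(loader, 4))
```

Because the iterable never ends, the consumer decides how many batches to draw, for example with `itertools.islice` as above.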

## Use a Custom Dataloader

If you use [DefaultTrainer](../modules/engine.html#detectron2.engine.defaults.DefaultTrainer),
you can override its `build_{train,test}_loader` methods to use your own dataloader.
See the [densepose dataloader](/projects/DensePose/train_net.py)
for an example.

If you write your own training loop, you can plug in your data loader easily.
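
Both options can be sketched briefly. The helper names `run_training` and `make_trainer_class` are hypothetical, `train_step` stands in for whatever runs the model and optimizer on one batch, and the Detectron2 import is deferred so the loop can be tried without it. The sketch assumes `build_train_loader` is a classmethod of `DefaultTrainer` that receives the config.

```python
import itertools


def run_training(data_loader, train_step, max_iter):
    """Custom training loop: pull at most `max_iter` batches from any iterable."""
    losses = []
    for batch in itertools.islice(data_loader, max_iter):
        losses.append(train_step(batch))
    return losses


def make_trainer_class(build_loader):
    """Return a DefaultTrainer subclass that gets its data from `build_loader(cfg)`.

    Requires Detectron2; `build_loader` is any callable returning a data loader.
    """
    from detectron2.engine import DefaultTrainer

    class Trainer(DefaultTrainer):
        @classmethod
        def build_train_loader(cls, cfg):
            return build_loader(cfg)

    return Trainer


# Toy usage of the custom loop: the "loss" is just the batch size.
toy_loader = iter([[{"id": 0}, {"id": 1}], [{"id": 2}]])
losses = run_training(toy_loader, train_step=lambda batch: float(len(batch)), max_iter=5)
```

The loop only relies on the loader being iterable, which is why any Python iterator producing the expected format can be plugged in.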
