[This is preliminary documentation and is subject to change.]
The Dfs framework includes base classes that support reading and writing Hdfs files, and is intended to be extensible to other
block-based file systems.
Classes
Class | Description
---|---
`DfsBaseCoordinator<TWorkDescription>` | Base class for Dfs coordinators. It expands directories into sets of files to read and keeps track of the mapping from IPAddresses to IPEndpoints.
`DfsBlock` | Work item describing a Dfs block to be read at a worker.
`DfsBlockCoordinator<TItem>` | Implementation of a MatchingCoordinator that manages work for reading Dfs files in parallel, split by blocks.
`DfsBlockWorker<TItem, TOutput>` | IWorker implementation for the Dfs worker that reads data from blocks. It is further specialized to different file formats by passing in functions for syncing to record boundaries and deserializing data.
`DfsFileCoordinator` | Base coordinator for workers that read an entire Hdfs file at a time rather than splitting the file into blocks. For each file, the coordinator tries to match it to a worker that holds a large proportion of the relevant data.
`DfsFileWorker<TOutput>` | Base worker implementation for reading Hdfs files an entire file at a time rather than block by block.
`DfsTextCoordinator` | Coordinator class for a text reader. No additional metadata is needed to describe a block, so this class uses DfsBlocks directly as work items.
`DfsTextWorker` | Worker class for a text reader for files with fixed-length blocks and text records. It uses DfsBlocks as work items and parses data into lines represented as strings.
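
The block worker descriptions above mention syncing to record boundaries when a file is split into fixed-length blocks. The sketch below illustrates one common way to do this for newline-delimited text. It is written in Python for brevity and is not the framework's code: the names `read_block_lines` and `read_all_blocks` are hypothetical, and the ownership rule (a line belongs to the block that contains its first byte) is an assumption made for illustration.

```python
from typing import List


def read_block_lines(data: bytes, offset: int, length: int) -> List[str]:
    """Return the lines owned by the block [offset, offset + length).

    Assumed convention: a line is owned by the block containing its first
    byte, so a reader may run past the end of its block to finish a line.
    """
    end = min(offset + length, len(data))
    pos = offset
    if offset > 0:
        # Sync to a record boundary. Searching from offset - 1 means that a
        # newline immediately before the block makes the line starting
        # exactly at `offset` belong to this block.
        newline = data.find(b"\n", offset - 1)
        if newline == -1:
            return []  # No line starts in or after this block.
        pos = newline + 1

    lines: List[str] = []
    while pos < end:
        newline = data.find(b"\n", pos)
        if newline == -1:
            # Final line of the file has no trailing newline.
            lines.append(data[pos:].decode("utf-8"))
            break
        lines.append(data[pos:newline].decode("utf-8"))
        pos = newline + 1
    return lines


def read_all_blocks(data: bytes, block_size: int) -> List[str]:
    """Read every block independently and concatenate the results."""
    lines: List[str] = []
    for offset in range(0, len(data), block_size):
        lines.extend(read_block_lines(data, offset, block_size))
    return lines
```

Because each line is read by exactly one block under this convention, the blocks can be processed independently and in any order, and concatenating the per-block results in offset order reproduces the lines of the whole file.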