public class NLineInputFormat extends FileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
Modifier and Type | Field and Description |
---|---|
static java.lang.String | LINES_PER_MAP |
Constructor and Description |
---|
NLineInputFormat() |
Modifier and Type | Method and Description |
---|---|
RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text> | createRecordReader(InputSplit genericSplit, TaskAttemptContext context): Create a record reader for a given split. |
static int | getNumLinesPerSplit(JobContext job): Get the number of lines per split. |
java.util.List<InputSplit> | getSplits(JobContext job): Logically splits the set of input files for the job, treating every N lines of the input as one split. |
static java.util.List<FileSplit> | getSplitsForFile(org.apache.hadoop.fs.FileStatus status, org.apache.hadoop.conf.Configuration conf, int numLinesPerSplit) |
static void | setNumLinesPerSplit(Job job, int numLines): Set the number of lines per split. |
Methods inherited from class FileInputFormat: addInputPath, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, isSplitable, listStatus, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize
public static final java.lang.String LINES_PER_MAP
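The sketch below illustrates how a key such as LINES_PER_MAP might back the setNumLinesPerSplit and getNumLinesPerSplit accessors. It is a Hadoop-free model, not the library source: java.util.Properties stands in for the job configuration, the class name is hypothetical, and both the key string and the default of one line per split are assumptions that should be checked against the Hadoop release in use.

```java
import java.util.Properties;

/** Hadoop-free sketch of a lines-per-split accessor pair (hypothetical class). */
class LinesPerSplitConfigSketch {

    // Assumed value of the LINES_PER_MAP constant; verify against your Hadoop version.
    static final String LINES_PER_MAP = "mapreduce.input.lineinputformat.linespermap";

    // Assumed fallback when the key is unset: one line per split.
    static final int DEFAULT_LINES_PER_SPLIT = 1;

    /** Stand-in for setNumLinesPerSplit(Job, int); Properties replaces the job configuration. */
    static void setNumLinesPerSplit(Properties conf, int numLines) {
        conf.setProperty(LINES_PER_MAP, Integer.toString(numLines));
    }

    /** Stand-in for getNumLinesPerSplit(JobContext). */
    static int getNumLinesPerSplit(Properties conf) {
        String value = conf.getProperty(LINES_PER_MAP);
        return value == null ? DEFAULT_LINES_PER_SPLIT : Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        // Key not set yet, so the assumed default is returned.
        System.out.println(getNumLinesPerSplit(conf));
        setNumLinesPerSplit(conf, 500);
        // The stored value is read back.
        System.out.println(getNumLinesPerSplit(conf));
    }
}
```

In a real job the same round trip would go through the static methods documented on this page, called with a Job or JobContext rather than a Properties object.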
public RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text> createRecordReader(InputSplit genericSplit, TaskAttemptContext context) throws java.io.IOException
Description copied from class: InputFormat
Create a record reader for a given split. The framework calls RecordReader.initialize(InputSplit, TaskAttemptContext) before the split is used.
Specified by: createRecordReader in class InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
Parameters:
genericSplit - the split to be read
context - the information about the task
Throws: java.io.IOException
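public java.util.List<InputSplit> getSplits(JobContext job) throws java.io.IOException
Logically splits the set of input files for the job, treating every N lines of the input as one split.
Overrides: getSplits in class FileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
Parameters:
job - the job configuration
Returns: the InputSplits for the job
Throws: java.io.IOException
See Also: FileInputFormat.getSplits(JobContext)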
public static java.util.List<FileSplit> getSplitsForFile(org.apache.hadoop.fs.FileStatus status, org.apache.hadoop.conf.Configuration conf, int numLinesPerSplit) throws java.io.IOException
Throws: java.io.IOException
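To make the "N lines as one split" idea concrete, here is a simplified sketch of how one file's content could be carved into byte ranges of numLinesPerSplit lines each. It is not the Hadoop implementation: the class, the ByteRange type, and the splitsForContent method are hypothetical, the input is an in-memory byte array rather than a FileStatus, and the real method may adjust split boundaries differently to cooperate with its record reader.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Simplified, Hadoop-free model of grouping N lines into one split (hypothetical class). */
class NLineSplitSketch {

    /** Hypothetical stand-in for FileSplit: a byte range within a single file. */
    static final class ByteRange {
        final long start;
        final long length;

        ByteRange(long start, long length) {
            this.start = start;
            this.length = length;
        }

        @Override
        public String toString() {
            return "ByteRange(start=" + start + ", length=" + length + ")";
        }
    }

    /** Emits one range per numLinesPerSplit newline-terminated lines, plus one for any remainder. */
    static List<ByteRange> splitsForContent(byte[] content, int numLinesPerSplit) {
        if (numLinesPerSplit < 1) {
            throw new IllegalArgumentException("numLinesPerSplit must be at least 1");
        }
        List<ByteRange> splits = new ArrayList<>();
        long begin = 0;        // byte offset where the current split starts
        int linesInSplit = 0;  // complete lines seen since 'begin'
        for (int i = 0; i < content.length; i++) {
            if (content[i] == '\n') {
                linesInSplit++;
                if (linesInSplit == numLinesPerSplit) {
                    long end = i + 1L; // offset just past the newline
                    splits.add(new ByteRange(begin, end - begin));
                    begin = end;
                    linesInSplit = 0;
                }
            }
        }
        // Leftover lines (fewer than N, or a final line without a newline) form the last split.
        if (begin < content.length) {
            splits.add(new ByteRange(begin, content.length - begin));
        }
        return splits;
    }

    public static void main(String[] args) {
        byte[] data = "a\nbb\nccc\ndddd\neeeee\n".getBytes(StandardCharsets.US_ASCII);
        // Five lines at two lines per split yield three ranges; the last holds one line.
        for (ByteRange range : splitsForContent(data, 2)) {
            System.out.println(range);
        }
    }
}
```

Under this model the number of splits, and therefore of map tasks, is the line count divided by N, rounded up, so a small N on a large input produces many short-lived tasks.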
public static void setNumLinesPerSplit(Job job, int numLines)
Set the number of lines per split.
Parameters:
job - the job to modify
numLines - the number of lines per split
public static int getNumLinesPerSplit(JobContext job)
Get the number of lines per split.
Parameters:
job - the job
Copyright © 2009 The Apache Software Foundation