A Deep Dive into Parsing TIFF in Java

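Recently, while reading a TIFF file sent by a customer, the underlying library failed with the error: bandOffsets.length is wrong! Because the message was raised inside the TIFF read call, I dug into the TIFF reading code in the underlying library.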

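An earlier post, "Java 通过geotools读取tiff" (reading TIFF from Java through GeoTools), briefly covered how GeoTools reads TIFF. Digging deeper shows that the class doing the real work behind the scenes is TIFFImageReader in imageio-ext. Every Java developer knows ImageIO; ImageIO-ext is an extension library for it. Its source is on GitHub, and it is a very capable library that helps a great deal when reading and writing all kinds of raster data in Java.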

To follow this post you first need to understand the structure of a TIFF file. A good reference is "TIFF文件结构详解" (TIFF file structure explained): https://blog.csdn.net/oYinHeZhiGuang/article/details/121710467

Now let's look at the TIFF reading code in imageio-ext. The main class is TIFFImageReader; let's see how a Java program reads a TIFF file with it.

Constructor:

public TIFFImageReader(ImageReaderSpi originatingProvider) {
    super(originatingProvider);
}

The class is instantiated through an ImageReaderSpi. This SPI (service provider interface) pattern is used by many open-source Java projects; here the provider is TIFFImageReaderSpi.

Next, set the input file and a few other parameters through the following method:

public void setInput(Object input,
                     boolean seekForwardOnly,
                     boolean ignoreMetadata)

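Here input is the file to be read. Setting seekForwardOnly to true means images and metadata can only be read from this input source in ascending order. Setting ignoreMetadata to true means metadata is ignored while reading.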
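The SPI-then-setInput flow can be sketched with the JDK's standard ImageIO lookup alone. This is only an illustration under assumptions: the class name ReaderSetup and the helper openReader are made up for this sketch, and it goes through ImageIO.getImageReadersByFormatName rather than constructing imageio-ext's TIFFImageReaderSpi directly. Whether a reader is registered for "tiff" depends on the JDK version and on which plugins are on the classpath.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.spi.ImageReaderSpi;
import javax.imageio.stream.FileImageInputStream;
import javax.imageio.stream.ImageInputStream;

public class ReaderSetup {

    /**
     * Hypothetical helper: finds a registered reader for the format, asks its
     * SPI for a fresh instance and sets the input on it.
     * Returns null when no plugin is registered for the format.
     */
    static ImageReader openReader(String formatName, ImageInputStream input) {
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        if (!readers.hasNext()) {
            return null;
        }
        // Every registered reader knows the SPI that created it.
        ImageReaderSpi spi = readers.next().getOriginatingProvider();
        try {
            ImageReader reader = spi.createReaderInstance();
            // seekForwardOnly = false: pages may be visited in any order.
            // ignoreMetadata = false: metadata stays available to the caller.
            reader.setInput(input, false, false);
            return reader;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: ReaderSetup <file.tif>");
            return;
        }
        try (ImageInputStream in = new FileImageInputStream(new File(args[0]))) {
            ImageReader reader = openReader("tiff", in);
            if (reader == null) {
                System.out.println("no TIFF reader plugin is registered");
            } else {
                System.out.println("reader class: " + reader.getClass().getName());
                reader.dispose();
            }
        }
    }
}
```

Passing false for both flags keeps random access to pages and keeps the metadata readable, which is what the metadata walkthrough that follows relies on.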

Next comes reading the TIFF metadata; see the getImageMetadata(int imageIndex) method:

public IIOMetadata getImageMetadata(int imageIndex) throws IIOException {
    seekToImage(imageIndex, true);
    TIFFImageMetadata im =
        new TIFFImageMetadata(imageMetadata.getRootIFD().getTagSetList());
    Node root =
        imageMetadata.getAsTree(TIFFImageMetadata.nativeMetadataFormatName);
    im.setFromTree(TIFFImageMetadata.nativeMetadataFormatName, root);
    if (noData != null) {
        im.setNoData(new double[] {noData, noData});
    }
    if (scales != null && offsets != null) {
        im.setScales(scales);
        im.setOffsets(offsets);
    }
    return im;
}

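The main logic is in seekToImage(imageIndex, true). Its first parameter, imageIndex, selects which page of a multi-page TIFF to use. The second parameter, optimized, controls whether page information cached by an earlier parse may be reused; when it is false, the metadata is always read again.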

private void seekToImage(int imageIndex, boolean optimized) throws IIOException {
    checkIndex(imageIndex);

    // TODO we should do this initialization just once!!!
    int index = locateImage(imageIndex);
    if (index != imageIndex) {
        throw new IndexOutOfBoundsException("imageIndex out of bounds!");
    }

    final Integer i = Integer.valueOf(index);
    // optimized branch
    if (!optimized) {
        readMetadata();
        initializeFromMetadata();
        return;
    }
    // in case we have cache the info for this page
    if (pagesInfo.containsKey(i)) {
        // initialize from cachedinfo only if needed
        // TODO Improve
        if (imageMetadata == null || !initialized) { // this means the curindex has changed
            final PageInfo info = pagesInfo.get(i);
            final TIFFImageMetadata metadata = info.imageMetadata.get();
            if (metadata != null) {
                initializeFromCachedInfo(info, metadata);
                return;
            }
            pagesInfo.put(i, null);
        }
    }

    readMetadata();
    initializeFromMetadata();
}

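In this method, the first load of a TIFF page goes through readMetadata() and initializeFromMetadata(), and the metadata is cached so that later reads of the same page can reuse it.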
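The cache-or-reload decision in seekToImage can be illustrated with a small standalone class. This is a sketch under assumptions, not the library's implementation: PageCache, its loader function and loadCount are invented names, and SoftReference is only a guess suggested by the info.imageMetadata.get() call in the excerpt.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

/** Hypothetical per-page cache that mirrors the branches of seekToImage. */
public class PageCache<M> {
    private final Map<Integer, SoftReference<M>> pages = new HashMap<>();
    private final IntFunction<M> loader; // stands in for readMetadata()
    private int loads = 0;

    public PageCache(IntFunction<M> loader) {
        this.loader = loader;
    }

    public M get(int pageIndex, boolean optimized) {
        if (optimized) {
            SoftReference<M> ref = pages.get(pageIndex);
            // The referent may have been cleared by the garbage collector,
            // in which case we fall through and reload.
            M cached = (ref == null) ? null : ref.get();
            if (cached != null) {
                return cached;
            }
        }
        // Non-optimized call, first visit, or cleared reference: parse again.
        M fresh = loader.apply(pageIndex);
        loads++;
        pages.put(pageIndex, new SoftReference<>(fresh));
        return fresh;
    }

    /** Number of times the loader actually ran. */
    public int loadCount() {
        return loads;
    }
}
```

With a loader such as i -> "page-" + i, two optimized calls for page 0 run the loader once, and a following non-optimized call runs it again.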

The reading process

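Reading is easiest to understand alongside the TIFF format itself. Broadly, the reader parses the TIFF header, obtains the IFD (image file directory, which holds the image's directory information), and then parses the contents of each directory entry in turn. That code is not listed here.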
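As a rough illustration of the header-then-IFD walk, here is a minimal sketch for classic (non-BigTIFF) files using only java.nio. It is not the imageio-ext code; the class MiniTiffHeader, its methods and the hand-built sample bytes are all made up for this example. The sample stores ImageWidth (tag 256) and ImageLength (tag 257) as LONG values so that the raw 4-byte field equals the value itself.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.LinkedHashMap;
import java.util.Map;

public class MiniTiffHeader {

    /** Maps tag number to {type, count, raw value-or-offset field} for the first IFD. */
    static Map<Integer, long[]> readFirstIfd(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        // Bytes 0-1 give the byte order: "II" little-endian, "MM" big-endian.
        if (data[0] == 'I' && data[1] == 'I') {
            buf.order(ByteOrder.LITTLE_ENDIAN);
        } else if (data[0] == 'M' && data[1] == 'M') {
            buf.order(ByteOrder.BIG_ENDIAN);
        } else {
            throw new IllegalArgumentException("bad byte-order mark");
        }
        // Bytes 2-3 hold the magic number 42 for classic TIFF.
        if ((buf.getShort(2) & 0xFFFF) != 42) {
            throw new IllegalArgumentException("not a classic TIFF");
        }
        // Bytes 4-7 hold the offset of the first IFD.
        buf.position(buf.getInt(4));
        int numEntries = buf.getShort() & 0xFFFF;
        Map<Integer, long[]> entries = new LinkedHashMap<>();
        for (int i = 0; i < numEntries; i++) {
            // Each entry is 12 bytes: tag, type, count, value or offset.
            int tag = buf.getShort() & 0xFFFF;
            long type = buf.getShort() & 0xFFFF;
            long count = buf.getInt() & 0xFFFFFFFFL;
            long valueOrOffset = buf.getInt() & 0xFFFFFFFFL;
            entries.put(tag, new long[] {type, count, valueOrOffset});
        }
        return entries;
    }

    /** Builds a tiny little-endian header with one IFD holding two LONG entries. */
    static byte[] sample() {
        ByteBuffer b = ByteBuffer.allocate(8 + 2 + 2 * 12 + 4).order(ByteOrder.LITTLE_ENDIAN);
        b.put((byte) 'I').put((byte) 'I').putShort((short) 42).putInt(8);
        b.putShort((short) 2);                                              // entry count
        b.putShort((short) 256).putShort((short) 4).putInt(1).putInt(640);  // ImageWidth
        b.putShort((short) 257).putShort((short) 4).putInt(1).putInt(480);  // ImageLength
        b.putInt(0);                                                        // no next IFD
        return b.array();
    }

    public static void main(String[] args) {
        readFirstIfd(sample()).forEach((tag, f) ->
            System.out.println("tag " + tag + " type " + f[0] + " count " + f[1] + " field " + f[2]));
    }
}
```

A real reader also has to follow the next-IFD offset for multi-page files and dereference values that do not fit in the 4-byte field.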

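Parsing the directory is how the TIFF metadata is obtained, and it mostly comes down to parsing each tag. The parsing code is in the initialize(ImageInputStream stream, boolean ignoreUnknownFields, final boolean isBTIFF) method of the TIFFIFD class: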


public void initialize(ImageInputStream stream,
        boolean ignoreUnknownFields, final boolean isBTIFF) throws IOException {
    removeTIFFFields();

    List tagSetList = getTagSetList();

    final long numEntries;
    if (isBTIFF)
        numEntries = stream.readLong();
    else
        numEntries = stream.readUnsignedShort();

    for (int i = 0; i < numEntries; i++) {
        // Read tag number, value type, and value count.
        int tag = stream.readUnsignedShort();
        int type = stream.readUnsignedShort();
        int count;
        if (isBTIFF) {
            long count_ = stream.readLong();
            count = (int) count_;
            if (count != count_)
                throw new IllegalArgumentException("unable to use long number of values");
        } else
            count = (int) stream.readUnsignedInt();

        // Get the associated TIFFTag.
        TIFFTag tiffTag = getTag(tag, tagSetList);

        // Ignore unknown fields.
        if (ignoreUnknownFields && tiffTag == null) {
            // Skip the value/offset so as to leave the stream
            // position at the start of the next IFD entry.

            if (isBTIFF)
                stream.skipBytes(8);
            else
                stream.skipBytes(4);

            // XXX Warning message ...

            // Continue with the next IFD entry.
            continue;
        }

        long nextTagOffset;

        if (isBTIFF) {
            nextTagOffset = stream.getStreamPosition() + 8;
            int sizeOfType = TIFFTag.getSizeOfType(type);
            if (count * sizeOfType > 8) {
                long value = stream.readLong();
                stream.seek(value);
            }
        } else {
            nextTagOffset = stream.getStreamPosition() + 4;
            int sizeOfType = TIFFTag.getSizeOfType(type);

            if (count * sizeOfType > 4) {
                long value = stream.readUnsignedInt();
                stream.seek(value);
            }
        }

        if (tag == BaselineTIFFTagSet.TAG_STRIP_BYTE_COUNTS ||
            tag == BaselineTIFFTagSet.TAG_TILE_BYTE_COUNTS ||
            tag == BaselineTIFFTagSet.TAG_JPEG_INTERCHANGE_FORMAT_LENGTH) {
            this.stripOrTileByteCountsPosition =
                stream.getStreamPosition();
            if (LAZY_LOADING) {
                type = type == TIFFTag.TIFF_LONG ? TIFFTag.TIFF_LAZY_LONG : TIFFTag.TIFF_LAZY_LONG8;
            }
        } else if (tag == BaselineTIFFTagSet.TAG_STRIP_OFFSETS ||
                   tag == BaselineTIFFTagSet.TAG_TILE_OFFSETS ||
                   tag == BaselineTIFFTagSet.TAG_JPEG_INTERCHANGE_FORMAT) {
            this.stripOrTileOffsetsPosition =
                stream.getStreamPosition();
            if (LAZY_LOADING) {
                type = type == TIFFTag.TIFF_LONG ? TIFFTag.TIFF_LAZY_LONG : TIFFTag.TIFF_LAZY_LONG8;
            }
        }

        Object obj = null;

        try {
            switch (type) {
            case TIFFTag.TIFF_BYTE:
            case TIFFTag.TIFF_SBYTE:
            case TIFFTag.TIFF_UNDEFINED:
            case TIFFTag.TIFF_ASCII:
                byte[] bvalues = new byte[count];
                stream.readFully(bvalues, 0, count);

                if (type == TIFFTag.TIFF_ASCII) {
                    // Can be multiple strings
                    final List<String> v = new ArrayList<String>();
                    boolean inString = false;
                    int prevIndex = 0;
                    for (int index = 0; index <= count; index++) {
                        if (index < count && bvalues[index] != 0) {
                            if (!inString) {
                                // start of string
                                prevIndex = index;
                                inString = true;
                            }
                        } else { // null or special case at end of string
                            if (inString) {
                                // end of string
                                final String s = new String(bvalues, prevIndex, index - prevIndex);
                                v.add(s);
                                inString = false;
                            }
                        }
                    }

                    count = v.size();
                    String[] strings;
                    if (count != 0) {
                        strings = new String[count];
                        for (int c = 0; c < count; c++) {
                            strings[c] = v.get(c);
                        }
                    } else {
                        // This case has been observed when the value of
                        // 'count' recorded in the field is non-zero but
                        // the value portion contains all nulls.
                        count = 1;
                        strings = new String[] {""};
                    }

                    obj = strings;
                } else {
                    obj = bvalues;
                }
                break;

            case TIFFTag.TIFF_SHORT:
                char[] cvalues = new char[count];
                for (int j = 0; j < count; j++) {
                    cvalues[j] = (char) (stream.readUnsignedShort());
                }
                obj = cvalues;
                break;

            case TIFFTag.TIFF_LONG:
            case TIFFTag.TIFF_IFD_POINTER:
                long[] lvalues = new long[count];
                for (int j = 0; j < count; j++) {
                    lvalues[j] = stream.readUnsignedInt();
                }
                obj = lvalues;
                break;

            case TIFFTag.TIFF_RATIONAL:
                long[][] llvalues = new long[count][2];
                for (int j = 0; j < count; j++) {
                    llvalues[j][0] = stream.readUnsignedInt();
                    llvalues[j][1] = stream.readUnsignedInt();
                }
                obj = llvalues;
                break;

            case TIFFTag.TIFF_SSHORT:
                short[] svalues = new short[count];
                for (int j = 0; j < count; j++) {
                    svalues[j] = stream.readShort();
                }
                obj = svalues;
                break;

            case TIFFTag.TIFF_SLONG:
                int[] ivalues = new int[count];
                for (int j = 0; j < count; j++) {
                    ivalues[j] = stream.readInt();
                }
                obj = ivalues;
                break;

            case TIFFTag.TIFF_SRATIONAL:
                int[][] iivalues = new int[count][2];
                for (int j = 0; j < count; j++) {
                    iivalues[j][0] = stream.readInt();
                    iivalues[j][1] = stream.readInt();
                }
                obj = iivalues;
                break;

            case TIFFTag.TIFF_FLOAT:
                float[] fvalues = new float[count];
                for (int j = 0; j < count; j++) {
                    fvalues[j] = stream.readFloat();
                }
                obj = fvalues;
                break;

            case TIFFTag.TIFF_DOUBLE:
                double[] dvalues = new double[count];
                for (int j = 0; j < count; j++) {
                    dvalues[j] = stream.readDouble();
                }
                obj = dvalues;
                break;

            case TIFFTag.TIFF_LONG8:
            case TIFFTag.TIFF_SLONG8:
            case TIFFTag.TIFF_IFD8:
                long[] lBvalues = new long[count];
                for (int j = 0; j < count; j++) {
                    lBvalues[j] = stream.readLong();
                }
                obj = lBvalues;
                break;

            case TIFFTag.TIFF_LAZY_LONG8:
            case TIFFTag.TIFF_LAZY_LONG:
                obj = new TIFFLazyData(stream, type, count);
                break;
            default:
                // XXX Warning
                break;
            }
        } catch (EOFException eofe) {
            // The TIFF 6.0 fields have tag numbers less than or equal
            // to 532 (ReferenceBlackWhite) or equal to 33432 (Copyright).
            // If there is an error reading a baseline tag, then re-throw
            // the exception and fail; otherwise continue with the next
            // field.
            if (BaselineTIFFTagSet.getInstance().getTag(tag) == null) {
                throw eofe;
            }
        }

        if (tiffTag == null) {
            // XXX Warning: unknown tag
        } else if (!tiffTag.isDataTypeOK(type)) {
            // XXX Warning: bad data type
        } else if (tiffTag.isIFDPointer() && obj != null) {
            stream.mark();
            stream.seek(((long[]) obj)[0]);

            List tagSets = new ArrayList(1);
            tagSets.add(tiffTag.getTagSet());
            TIFFIFD subIFD = new TIFFIFD(tagSets);

            // XXX Use same ignore policy for sub-IFD fields?
            subIFD.initialize(stream, ignoreUnknownFields);
            obj = subIFD;
            stream.reset();
        }

        if (tiffTag == null) {
            tiffTag = new TIFFTag(null, tag, 1 << type, null);
        }

        // Add the field if its contents have been initialized which
        // will not be the case if an EOF was ignored above.
        if (obj != null) {
            TIFFField f = new TIFFField(tiffTag, type, count, obj);
            addTIFFField(f);
        }

        stream.seek(nextTagOffset);
    }

    this.lastPosition = stream.getStreamPosition();
}
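One step in initialize that is easy to miss is the count * sizeOfType check: a field's data sits directly in the entry only if it fits in the value slot (4 bytes in classic TIFF, 8 in BigTIFF); otherwise the slot holds an offset that the reader seeks to. The sketch below isolates that rule. ValueLocation and storedInline are hypothetical names, and the size table covers only the twelve TIFF 6.0 field types, not the BigTIFF or lazy types used in the listing.

```java
public class ValueLocation {

    // Byte sizes of TIFF 6.0 field types 1..12 (index 0 is unused):
    // BYTE, ASCII, SHORT, LONG, RATIONAL, SBYTE, UNDEFINED,
    // SSHORT, SLONG, SRATIONAL, FLOAT, DOUBLE.
    private static final int[] SIZE = {0, 1, 1, 2, 4, 8, 1, 1, 2, 4, 8, 4, 8};

    /**
     * Returns true when the field data fits in the entry's value slot,
     * false when the slot holds an offset to the data instead.
     */
    static boolean storedInline(int type, long count, boolean bigTiff) {
        if (type < 1 || type >= SIZE.length) {
            throw new IllegalArgumentException("unsupported field type: " + type);
        }
        int slotBytes = bigTiff ? 8 : 4;
        // long arithmetic avoids overflow for very large counts.
        return count * SIZE[type] <= slotBytes;
    }

    public static void main(String[] args) {
        // Three SHORT values (for example BitsPerSample of an RGB image) need 6 bytes.
        System.out.println("classic: " + storedInline(3, 3, false));
        System.out.println("BigTIFF: " + storedInline(3, 3, true));
    }
}
```

Three SHORT values therefore live behind an offset in a classic TIFF but fit inline in a BigTIFF entry.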


Commonly used TIFF tag set classes include BaselineTIFFTagSet, FaxTIFFTagSet, GeoTIFFTagSet, EXIFParentTIFFTagSet and PrivateTIFFTagSet.

GeoTIFFTagSet covers the extra information stored by GeoTIFF; GeoTIFF is the TIFF format's way of storing GIS data. PrivateTIFFTagSet adds support for GDAL, namely its NODATA and METADATA information.

As for the bandOffsets.length is wrong! error mentioned at the start, the cause lies in the following call inside getImageTypes(int imageIndex):

ImageTypeSpecifier itsRaw =
    TIFFDecompressor.getRawImageTypeSpecifier
        (photometricInterpretation,
         compression,
         samplesPerPixel,
         bitsPerSample,
         sampleFormat,
         extraSamples,
         colorMap);

The problem finally surfaces in the Interleaved(ColorSpace colorSpace, int[] bandOffsets, int dataType, boolean hasAlpha, boolean isAlphaPremultiplied) constructor inside the ImageTypeSpecifier class:

public Interleaved(ColorSpace colorSpace,
                   int[] bandOffsets,
                   int dataType,
                   boolean hasAlpha,
                   boolean isAlphaPremultiplied) {
    if (colorSpace == null) {
        throw new IllegalArgumentException("colorSpace == null!");
    }
    if (bandOffsets == null) {
        throw new IllegalArgumentException("bandOffsets == null!");
    }
    int numBands = colorSpace.getNumComponents() +
        (hasAlpha ? 1 : 0);
    if (bandOffsets.length != numBands) {
        throw new IllegalArgumentException
            ("bandOffsets.length is wrong!");
    }
    // ...

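So the error is thrown whenever the number of band offsets differs from the number of bands, that is, the color space's component count plus one if the image has an alpha channel.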
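The same check can be triggered without any TIFF file by calling the JDK's public factory ImageTypeSpecifier.createInterleaved. The sketch below is only a reproduction of the exception, not a diagnosis of the customer's file; BandOffsetsDemo and tryInterleaved are names made up for this example. sRGB has three components, so four offsets are accepted only when hasAlpha is true.

```java
import java.awt.color.ColorSpace;
import java.awt.image.DataBuffer;
import javax.imageio.ImageTypeSpecifier;

public class BandOffsetsDemo {

    /** Returns "ok" if the specifier can be built, otherwise the exception message. */
    static String tryInterleaved(int[] bandOffsets, boolean hasAlpha) {
        ColorSpace rgb = ColorSpace.getInstance(ColorSpace.CS_sRGB); // 3 components
        try {
            ImageTypeSpecifier.createInterleaved(
                    rgb, bandOffsets, DataBuffer.TYPE_BYTE, hasAlpha, false);
            return "ok";
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // 3 offsets, no alpha: matches the 3 sRGB components.
        System.out.println(tryInterleaved(new int[] {0, 1, 2}, false));
        // 4 offsets, no alpha: one band more than the color space allows.
        System.out.println(tryInterleaved(new int[] {0, 1, 2, 3}, false));
        // 4 offsets with alpha: 3 components + 1 alpha band.
        System.out.println(tryInterleaved(new int[] {0, 1, 2, 3}, true));
    }
}
```

When debugging a real file, comparing samplesPerPixel and extraSamples against the color space implied by photometricInterpretation is a reasonable first check, since those are the inputs passed to getRawImageTypeSpecifier above.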

Summary

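Working through this problem gave a reasonably complete picture of how Java reads TIFF with ImageIO-ext, and the steps map closely onto the TIFF data structures.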

 

 

 

 

 

 

 

 
