Determining a File Type In Java

Most applications include file upload and download features. During these transfers, we sometimes need to determine the format of a file, or to verify that a file really has the format the user selected. Java offers several approaches for this, and we will walk through them in this article.

1. Files.probeContentType(Path)

The probeContentType method of the java.nio.file.Files class, introduced in Java 7, returns the content type of the file at the Path passed to it as a parameter.

Below, we pass the name of the example file, JPG_Test_File.jpg, to the getFileTypeByProbeContentType method. This method calls Files.probeContentType(Path), and we get image/jpeg as the file type.
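A minimal sketch of such a method is shown here. The class name ProbeContentTypeExample is a placeholder chosen for illustration, and the value returned depends on the file type detectors installed on the platform, so results may differ between operating systems and JDK versions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical wrapper class; only the method name comes from the article.
class ProbeContentTypeExample {

    // Returns the content type reported by the platform's file type
    // detectors, or null when the type cannot be determined.
    static String getFileTypeByProbeContentType(String fileName) throws IOException {
        Path path = Paths.get(fileName);
        return Files.probeContentType(path);
    }

    public static void main(String[] args) throws IOException {
        // Pass one or more file names as arguments, e.g. JPG_Test_File.jpg.
        for (String fileName : args) {
            System.out.println(fileName + ": " + getFileTypeByProbeContentType(fileName));
        }
    }
}
```

Because the method can return null, callers should check the result before using it.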

Output: image/jpeg

However, if we only change the extension of the file to PPTX and pass the new file name to the same method, we do not get the same result:

Output: application/vnd.openxmlformats-officedocument.presentationml.presentation

If we rename the file and remove the extension completely, the same method cannot determine a file type at all:

Output: null

2. MimetypesFileTypeMap.getContentType(String)

We can pass the file's name to the getContentType method of the MimetypesFileTypeMap class, which came with Java 6, to get the file type.

Here is our method:
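A minimal sketch, assuming the javax.activation classes are available. They were bundled with Java SE 6 through 8 and removed from the JDK in Java 11, so on newer versions this assumes the JavaBeans Activation Framework has been added as a dependency. The class and method names are hypothetical.

```java
import javax.activation.MimetypesFileTypeMap;

// Hypothetical names; the article does not give the original listing.
class MimetypesFileTypeMapExample {

    // Looks up the content type from the file name's extension only.
    // Unknown or missing extensions map to application/octet-stream.
    static String getFileTypeByMimetypesFileTypeMap(String fileName) {
        MimetypesFileTypeMap fileTypeMap = new MimetypesFileTypeMap();
        return fileTypeMap.getContentType(fileName);
    }

    public static void main(String[] args) {
        // Pass one or more file names as arguments.
        for (String fileName : args) {
            System.out.println(fileName + ": " + getFileTypeByMimetypesFileTypeMap(fileName));
        }
    }
}
```

The lookup never opens the file, so the file does not even have to exist.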

If we call this method for the file whose extension we changed to PPTX, we get the following file type:

Output: application/octet-stream

3. URLConnection.getContentType()

With the getContentType method of the URLConnection class, we can get the content type of a file.
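One way to do this for a local file is sketched below, assuming the file is reachable through a file: URL. The class and method names are hypothetical, and closing the connection's input stream is a precaution added here, not something the article prescribes.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URLConnection;

// Hypothetical names; the article does not give the original listing.
class UrlConnectionExample {

    // Opens a file: URL connection to the file and asks it for the content type.
    // Throws an IOException if the file cannot be opened.
    static String getFileTypeByUrlConnection(File file) throws IOException {
        URLConnection connection = file.toURI().toURL().openConnection();
        // Obtain the stream up front so it is closed once the type has been read.
        try (InputStream in = connection.getInputStream()) {
            return connection.getContentType();
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass one or more paths of existing files as arguments.
        for (String fileName : args) {
            System.out.println(fileName + ": " + getFileTypeByUrlConnection(new File(fileName)));
        }
    }
}
```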

If we call this method for the file whose extension we changed to PPTX, we get the following file type:

Output: content/unknown

4. Apache Tika

The previous three approaches are provided by the JDK. However, there are third-party options as well, such as Apache Tika. Apache Tika is a widely used library that detects a file's type by analyzing its content, independently of its extension.

Our method takes an InputStream as its parameter and uses the detect method of Apache Tika:
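A minimal sketch, assuming the Apache Tika core library is on the classpath. The class name TikaStreamExample is a placeholder, and creating a new Tika instance on every call is a simplification for illustration.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.apache.tika.Tika;

// Hypothetical wrapper class; requires Apache Tika on the classpath.
class TikaStreamExample {

    // Detects the type from the leading bytes of the stream; no file name is involved.
    static String getFileTypeByTika(InputStream inputStream) throws IOException {
        Tika tika = new Tika();
        return tika.detect(inputStream);
    }

    public static void main(String[] args) throws IOException {
        // Pass one or more paths of existing files as arguments.
        for (String fileName : args) {
            try (InputStream in = new FileInputStream(fileName)) {
                System.out.println(fileName + ": " + getFileTypeByTika(in));
            }
        }
    }
}
```

Since only the stream's bytes are inspected, the name or extension of the underlying file plays no part in the result.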

If we open the file that was originally in JPEG format, but whose extension we changed to PPTX, as a FileInputStream and pass it to getFileTypeByTika, we get the following result:

Output: image/jpeg

Tika detected the type of file correctly.

The detect method of Apache Tika also accepts a File parameter instead of an InputStream. We will use the following method to call detect with a File:
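A minimal sketch of the File variant, again assuming Apache Tika is on the classpath. The class name TikaFileExample is hypothetical.

```java
import java.io.File;
import java.io.IOException;

import org.apache.tika.Tika;

// Hypothetical wrapper class; requires Apache Tika on the classpath.
class TikaFileExample {

    // Lets Tika open the file itself, so both its content and its name are available.
    static String getFileTypeByTika(File file) throws IOException {
        Tika tika = new Tika();
        return tika.detect(file);
    }

    public static void main(String[] args) throws IOException {
        // Pass one or more paths of existing files as arguments.
        for (String fileName : args) {
            System.out.println(fileName + ": " + getFileTypeByTika(new File(fileName)));
        }
    }
}
```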

If we pass in the file that was originally in JPEG format, but whose extension we changed to PPTX, Tika again detects the file type correctly:

Output: image/jpeg

As we can see, Apache Tika detects the file type correctly despite the changed file extension. We can use Apache Tika when the file type is crucial or can affect the flow of an application.
