Reading a PDF file to text in C#
Description This article shows how you can read a PDF file and put its contents in a string variable very simply.
Author Sumit Gupta   Posted On     30 Apr 2011
Tags ASP.NET,C#    


I used PDFBox. PDFBox is a Java PDF library, but a .NET version is also available.

The first step is to download PDFBox.

Then add references to the two required files from the bin directory of the downloaded package.



Then put the following code in a class file to read a PDF file:

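using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using org.pdfbox.pdmodel;
using org.pdfbox.util;

/// <summary>
/// Extracts the text content of a PDF file using PDFBox.
/// </summary>
public class ConvertFromPDF
{
    public static string parseUsingPDFBox(string filename)
    {
        // Load the PDF document from disk.
        PDDocument doc = PDDocument.load(filename);
        try
        {
            // The stripper walks the pages and returns their text as one string.
            PDFTextStripper stripper = new PDFTextStripper();
            return stripper.getText(doc);
        }
        finally
        {
            // Release the file handle held by the document.
            doc.close();
        }
    }
}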
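
As an illustration of how the extracted string might be used, here is a minimal sketch that saves the text next to the PDF as a .txt file. The class name PdfTextExport and the method SaveTextNextToPdf are hypothetical and not part of PDFBox or of the sample code; the extraction step is passed in as a function so that the helper itself does not depend on any PDF library.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of PDFBox or of the article's sample code.
public static class PdfTextExport
{
    // Runs the supplied extraction function on the PDF and writes the
    // resulting text to a file with the same name and a .txt extension.
    // Returns the path of the text file that was written.
    public static string SaveTextNextToPdf(string pdfPath, Func<string, string> extractText)
    {
        if (!File.Exists(pdfPath))
            throw new FileNotFoundException("PDF file not found.", pdfPath);

        string text = extractText(pdfPath);
        string textPath = Path.ChangeExtension(pdfPath, ".txt");
        File.WriteAllText(textPath, text);
        return textPath;
    }
}
```

Assuming the ConvertFromPDF class above is in the same project, it could be called as PdfTextExport.SaveTextNextToPdf(path, ConvertFromPDF.parseUsingPDFBox), where path is the location of your PDF file.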


About Author

I am Sumit Gupta, working at 3 Pillar Global Pvt. Ltd. as a Module Lead. I have 7+ years of experience in .NET technologies. I love to explore new technologies and write technical articles.
Country India
Company 3 Pillar Global Pvt. Ltd.



Posted By Akhil on 17 Aug 2011 at 09:49 PM
Very helpful topic, great man...