tika-lambda

Run Apache Tika as a service in AWS Lambda by scanning documents in S3 and storing the extracted text back to S3

Main changes from the original version:

TikaLambdaHandler is no longer abstract. It saves the tika-extracted text back to the S3 bucket with the same key + ".extract"
More logging
Maven: Upgraded to use aws-lambda-java* v1.1 from v1.0
Maven: Locked in to use the aws-java-sdk 1.10.27 due to incompatabilities with what's currently running on AWS Lambda (odd errors about Apache httpcore due to mis-matched versions in the AWS Lambda classpath). By specifying which version of AWS Java SDK we're on, we can force the classpath to have the version we want and avoid the runtime mis-match errors

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
src/main/java/com/dovetailsoftware/aws/lambda		src/main/java/com/dovetailsoftware/aws/lambda
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pom.xml		pom.xml

Provide feedback