I needed an endpoint that converts markdown to a standard PDF, for reasons. This is how I made a simple, low-traffic markdown-to-PDF converter using AWS Lambda and pandoc. When evaluating solutions, I discovered that everything required either WebKit or a native LaTeX installation. Both are painful for users to maintain just to run a script I was building. This solution freezes the dependencies inside an image, so it should keep working into the future. AWS Lambda is really neat because it lets you put whatever you want inside a Docker image and expose it openly on the internet. The hard part is putting all the steps together and doing the Docker stuff. So here’s what I did to make it.
I decided to use a Debian image because I know their collection of packages is pretty great and generally stays up to date. I also know that their version of LaTeX is generally well built and favored by academics of all ages.
The important parts are below. I didn’t bother with squashing the image or doing any fat trimming since this is meant to be quick and dirty. AWS actually has pretty decent docs on how to deploy a Docker image to Lambda.
FROM debian:latest
ARG FUNCTION_DIR="/function"
RUN apt update && apt upgrade -y
RUN apt install -y pandoc texlive python3 python3-pip
RUN mkdir -p ${FUNCTION_DIR}
WORKDIR ${FUNCTION_DIR}
RUN pip install --target ${FUNCTION_DIR} awslambdaric
COPY . ${FUNCTION_DIR}
ENTRYPOINT [ "/usr/bin/python3", "-m", "awslambdaric" ]
CMD [ "function.handler" ]
There is one minor optimization here: the COPY command is the last step that changes the filesystem. The layers above it stay cached, so fewer layers change between builds and pushes to the repo, which shortens the dev/test cycle.
Honestly, GitHub Copilot wrote most of the app because it was just write file, shell out, read file, and return. Pretty basic stuff and common patterns.
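To make the write file, shell out, read file, return pattern concrete, here is a minimal sketch of what such a handler could look like. It is not the code from the repo. The helper names (`read_markdown`, `convert`, `pdf_response`), the temp file names, and the bare `pandoc input.md -o output.pdf` invocation are assumptions for illustration. The response shape assumes a Lambda function URL, which takes binary bodies as base64.

```python
import base64
import subprocess
import tempfile
from pathlib import Path


def read_markdown(event: dict) -> str:
    """Pull the POSTed markdown out of a function URL event."""
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")
    return body


def convert(markdown: str, runner=subprocess.run) -> bytes:
    """Write the markdown to disk, shell out to pandoc, read the PDF back."""
    # tempfile uses /tmp on Linux unless TMPDIR says otherwise; that is the
    # writable location inside Lambda.
    with tempfile.TemporaryDirectory() as workdir:
        src = Path(workdir) / "input.md"
        out = Path(workdir) / "output.pdf"
        src.write_text(markdown, encoding="utf-8")
        # check=True raises if pandoc exits with a non-zero status
        runner(["pandoc", str(src), "-o", str(out)], check=True, cwd=workdir)
        return out.read_bytes()


def pdf_response(pdf: bytes) -> dict:
    """Wrap the PDF bytes in a base64 response."""
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/pdf"},
        "isBase64Encoded": True,
        "body": base64.b64encode(pdf).decode("ascii"),
    }


def handler(event, context):
    # Entry point referenced as "function.handler" in the Dockerfile CMD
    return pdf_response(convert(read_markdown(event)))
```

The `runner` parameter only exists so the shell-out can be swapped for a fake when testing without pandoc installed.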
I didn’t even bother writing any CloudFormation because it wasn’t necessary for a one-off. You can literally go in the console and play click ops for it. I did write a basic shell script to build the image and deploy it to the Lambda function.
There’s a neat trick to ensure that you wait for the update to complete before you try to deploy again.
aws lambda wait function-updated-v2 --function-name hello-pdf
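For context, here is a rough sketch of where that wait fits in a build and deploy loop. The real script is a shell script; this version is written in Python and only approximates the idea. The image URI is a placeholder you would replace with your own ECR repo, and it assumes you are already logged in to the registry.

```python
import subprocess


def deploy_commands(function_name: str, image_uri: str) -> list[list[str]]:
    """Return the build, push, update, and wait commands in the order they run."""
    return [
        ["docker", "build", "-t", image_uri, "."],
        ["docker", "push", image_uri],
        ["aws", "lambda", "update-function-code",
         "--function-name", function_name, "--image-uri", image_uri],
        # Blocks until Lambda reports the update as finished, so the next
        # deploy doesn't collide with one that is still in progress.
        ["aws", "lambda", "wait", "function-updated-v2",
         "--function-name", function_name],
    ]


def deploy(function_name: str, image_uri: str) -> None:
    for cmd in deploy_commands(function_name, image_uri):
        subprocess.run(cmd, check=True)  # stop at the first failing step


# Example call (hypothetical image URI):
# deploy("hello-pdf", "<account>.dkr.ecr.<region>.amazonaws.com/hello-pdf:latest")
```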
Calling the Function
You can call it with a basic curl command. It just expects a markdown file to be POSTed to its endpoint.
url=$(aws lambda get-function-url-config --function-name hello-pdf --query FunctionUrl --output text)
curl -i -X POST "$url" -H "Content-Type: text/plain" --data-binary @TEST.md
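If you would rather call it from a script than from curl, the same request can be sketched with Python's standard library. The function names here are made up for illustration, and it assumes the endpoint answers with the raw PDF bytes in the response body.

```python
import urllib.request


def build_request(url: str, markdown: bytes) -> urllib.request.Request:
    """Build the same POST that the curl command sends."""
    return urllib.request.Request(
        url,
        data=markdown,
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def markdown_to_pdf(url: str, md_path: str, pdf_path: str) -> None:
    """POST a markdown file to the function URL and save the response body."""
    with open(md_path, "rb") as fh:
        request = build_request(url, fh.read())
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses
    with urllib.request.urlopen(request) as response:
        pdf = response.read()
    with open(pdf_path, "wb") as fh:
        fh.write(pdf)


# Example call, with the function URL looked up as shown above:
# markdown_to_pdf(url, "TEST.md", "TEST.pdf")
```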
Deploying it Yourself
You can go ahead and find the code for this application on my GitHub under a BSD 0-clause license. (Honestly any engineer who needed a similar thing packaged up would arrive at a similar solution to mine. It’s not that novel.) The README includes detailed deployment directions.
I hope you find this application useful in your endeavours. This solves a problem and may make your life easier in the long run, but who knows.