
Creating a Deliverable HTML Email on AWS Lambda with SES

Creating deliverable rich html emails is a worthwhile goal for many web applications: it lets you communicate with your customers through messages that are beautiful and that make your marketing and design people happy.

As discussed in this SendGrid article, there are quite a number of approaches for doing this, with the tricky part being how to handle images in emails. The leading options in 2017 are referencing attachments using cid, and using data urls.

According to the comments in this Campaign Monitor blog post, the cid method is supported by all the major clients today (ironically, the post itself is advocating the cooler approach of data urls for images).

With this background, the question is how to do this with AWS Lambda and SES.

Thankfully it's really straightforward.

The simple steps are:

  • create a simple html email that references images using cid: as the protocol.
  • create a raw rfc822 email string that can be sent with the SES api.
  • use the ses.sendRawEmail method to send the email.

1 Create a simple html email

For example:

<html><body><p>Hello world</p><img src="cid:world"></body></html>

Note that the source of the image is of the format cid:world; this cid is what you specify when attaching the image to the email.

2 Create a raw rfc822 email string

The mailcomposer package, part of Nodemailer (https://nodemailer.com/extras/mailcomposer/), provides a simple, easy to use api for creating rfc822 emails with attachments. When creating attachments you can specify cids to refer to them by, and you can provide the content of an attachment as a local filename, a buffer, or even an http resource. It's a great api; take a look at the npm page to see more. One example of using this package is:

// mailcomposer is the standalone build of Nodemailer's composer (npm install mailcomposer)
const MailComposer = require('mailcomposer');

let from = 'from@example.com';
let to = 'to@example.com';
let subject = 'Subject';
let htmlMessage = '<html><body><p>Hello world</p><img src="cid:world"></body></html>';
let mail = new MailComposer({
  from: from, to: to, subject: subject, html: htmlMessage,
  attachments: [{
    filename: 'hello-world.jpg',
    path: 'https://cdn.pixabay.com/photo/2015/10/23/10/55/business-man-1002781_960_720.jpg',
    cid: 'world'  // matches the cid:world reference in the html
  }]
});
mail.build(function(err, res) {
  if (err) { return console.error(err); }
  console.log(res.toString());
});

3 Send the email with SES

Take the buffer that mail.build produces and send it with SES.

// assumes an SES client from the aws-sdk, created in a region where SES is available
const AWS = require('aws-sdk');
const ses = new AWS.SES();

let sesParams = {
  RawMessage: {
    Data: message  // the rfc822 buffer produced by mail.build
  },
};
ses.sendRawEmail(sesParams, function(err, res){console.log(err, res)});

Full example using promises

Let's put it all together, and pull in some of the promise code that I talked about in an earlier blog post (http://www.rojotek.com/blog/2017/04/11/create-a-promise-wrapper-for-a-standand-node-callback-method/).

// assumes: npm install aws-sdk mailcomposer
const AWS = require('aws-sdk');
const MailComposer = require('mailcomposer');
const ses = new AWS.SES();

function createEmail(){
  let from = 'from@example.com';
  let to = 'to@example.com';
  let subject = 'Subject';
  let htmlMessage = '<html><body><p>Hello world</p><img src="cid:world"></body></html>';
  let mail = new MailComposer({
    from: from, to: to, subject: subject, html: htmlMessage,
    attachments: [{
      filename: 'hello-world.jpg',
      path: 'https://cdn.pixabay.com/photo/2015/10/23/10/55/business-man-1002781_960_720.jpg',
      cid: 'world'
    }]
  });

  // wrap the callback-style build in a promise
  return new Promise((resolve, reject) => {
    mail.build(function(err, res) {
      err ? reject(err) : resolve(res);
    });
  });
}

createEmail().then(message => {
  let sesParams = {
    RawMessage: {
      Data: message
    },
  };
  return ses.sendRawEmail(sesParams).promise();
}).catch(err => console.error(err));
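Since all of this is meant to run on AWS Lambda, the chain above can be wrapped in a handler. Here's a minimal sketch, assuming the requires and ses client from the snippet above (the handler shape is just the standard Node Lambda callback convention):

exports.handler = (event, context, callback) => {
  createEmail().then(message => {
    let sesParams = {
      RawMessage: {
        Data: message
      },
    };
    return ses.sendRawEmail(sesParams).promise();
  }).then(res => callback(null, res))    // report success to Lambda
    .catch(err => callback(err));        // report failure to Lambda
};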

Creating emails that include attachments is really quite easy with node, lambda and ses, and is a great step towards delivering rich emails that look like what your designers want.

 

9 Things I learnt while moving data from RedShift into AWS Elastic Search with AWS Lambda

The Amazon infrastructure is amazing and allows for interesting and cool scaling without the use of servers. It's exciting to see what can be done. The trick is that many of the elements are asynchronous, so it can be easy to flood services, particularly when pulling data out of your RedShift data warehouse and putting it into Elastic Search. I learnt a bunch of things while doing this; the salient points are below.

  1. Don't gzip the data you unload.
  2. Use the bulk load on elastic.
  3. Use a large number of records in the bulk load (>5000). Fewer large bulk loads are better than many smaller ones; when working with AWS elastic search there is a risk of hitting the limits of the bulk queue size.
  4. Process a single file in the lambda and then recursively call the lambda function with an event.
  5. Before recursing, wait for a couple of seconds using setTimeout.
  6. When waiting, make sure that you aren't idle for 30 seconds, because your lambda will stop.
  7. Don't use s3 object creation to trigger your lambda, or you'll end up with multiple lambda functions being called at the same time.
  8. Don't bother trying to put kinesis in the middle; unloading your data into kinesis is almost certain to hit load limits in kinesis.
  9. Monitor your elastic search bulk queue size with something like this:
    curl https://%ES-SERVER:PORT%/_nodes/stats/thread_pool |jq '.nodes |to_entries[].value.thread_pool.bulk'

1 Unloading from RedShift

Doing the gunzip in the lambda takes time and resources in the lambda function. Avoid this by just storing the plain CSV in s3 and then streaming it out with S3.getObject(params).createReadStream().
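As a rough sketch (the bucket and key here are placeholders; in practice they'd arrive in the lambda's event payload), streaming the unloaded CSV out of s3 line by line looks like this:

const AWS = require('aws-sdk');
const readline = require('readline');
const s3 = new AWS.S3();

// placeholder bucket/key for one of the files produced by UNLOAD
let params = { Bucket: 'my-unload-bucket', Key: 'unload-folder/0000_part_00' };

// stream the object line by line instead of buffering (or gunzipping) the whole file
let lines = readline.createInterface({ input: s3.getObject(params).createReadStream() });
lines.on('line', line => {
  // parse the csv line and add it to the current bulk batch here
});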

Here is an UNLOAD statement that works well for me.

UNLOAD ('%SOME_AMAZING_QUERY_FROM_A_BIG_TABLE%')
TO 's3://%BUCKET%/%FOLDER%'
credentials 'aws_access_key_id=%AWS_KEY_ID%;aws_secret_access_key=%AWS_ACCESS_KEY%'
DELIMITER AS ',' NULL AS '' ESCAPE ADDQUOTES;

2 Use the bulk load in elastic

The elastic bulk load operation is your friend. Don't index each record one at a time and consume lots of resources; instead send up batches using the bulk operation.
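For reference, a bulk request body is newline-delimited json: one action line followed by one document line per record, POSTed to the _bulk endpoint. A minimal sketch of building that body (the index and type names here are made up):

// build an NDJSON bulk body: an action line plus a document line per record
function buildBulkBody(records) {
  return records.map(record =>
    JSON.stringify({ index: { _index: 'my-index', _type: 'my-type', _id: record.id } }) +
    '\n' + JSON.stringify(record)
  ).join('\n') + '\n';  // the bulk api requires a trailing newline
}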

3 Use a large number of records in the bulk load

Use more than 5000 records at a time in the bulk load. Fewer big loads are better than many small ones. See https://www.elastic.co/guide/en/elasticsearch/guide/current/bulk.html#_how_big_is_too_big for working out the number and size that suits your cluster.
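As a sketch, splitting the parsed records into suitably large batches before calling the bulk endpoint could look like this (with batchSize being something over 5000):

// slice records into arrays of at most batchSize, one bulk request per batch
function toBatches(records, batchSize) {
  let batches = [];
  for (let i = 0; i < records.length; i += batchSize) {
    batches.push(records.slice(i, i + batchSize));
  }
  return batches;
}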

4 Process a single file in each lambda function

To ensure that you don't consume too many resources, process a single file in each lambda function, then recurse using:

// assumes a client created earlier with: const Lambda = new AWS.Lambda();
// nb: pass a callback or call .promise() on the result to actually send the request
Lambda.invoke({
    FunctionName: context.invokedFunctionArn,  // re-invoke this same function
    InvocationType: 'Event',  // asynchronous, fire-and-forget
    Payload: JSON.stringify(payload)
});

Use either the promise version or callback version as preferred. Keep track of where you are in the payload.
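For example, the promise version (invokeParams being the parameter object from the snippet above) looks like:

Lambda.invoke(invokeParams).promise()
  .then(() => console.log('recursive invoke dispatched'))
  .catch(err => console.error(err));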

5 Wait before recursing

Before doing the recursive invoke above, wait a couple of seconds to give elastic a chance to catch up.
setTimeout(function(){recurseFunction(event, context, callback)}, 2000);

6 Keep the wait short

If you don't do anything for 30 seconds, Lambda will time out. Keep the wait short; 2 seconds (as chosen above) wasn't completely arbitrary.

7 Don’t use s3 object creation to trigger your lambda

A consistent theme here is controlling the rate at which data flows into elastic. Using s3 object creation triggers for the lambda will result in multiple concurrent calls to your lambda function, and too much data arriving at the same time. Trigger the lambda some other way.

8 Kinesis isn’t the answer to this problem

Putting the records to index into kinesis is not a good way to control the massive flow of data from redshift to elastic. While kinesis is great for controlling streams of data over time, it's not really the right component for loading lots of records at once. The approach outlined throughout this post is more suitable.

9 Monitor your elastic resources with curl and jq

Unix commandline tools rock.

curl and jq are great tools for working with http data: curl for getting data, jq for processing the json (https://stedolan.github.io/jq/).

elastic provides json apis for seeing the data. The command below looks up the information on the bulk queue size.

curl https://%ES-SERVER:PORT%/_nodes/stats/thread_pool |jq '.nodes |to_entries[].value.thread_pool.bulk'

Conclusion

Serverless and the AWS stack are nice, but you need to think about how to use them. Knowing the tools and capabilities of the platform is important, and with care you can do amazing things. Go build some great stuff.