3 GraphQL pitfalls and how we avoid them
On what was probably a foggy San Francisco day in late 2017, one of Vanta’s co-founders made a fateful commit:
That day, we did indeed serve some GraphQL requests. Since then we’ve been on a quest to get GraphQL working just how we like it.
Vanta chose GraphQL as our API layer early in our company’s history. It was the hot new thing and had a lot of potential, but industry best practices hadn’t yet been established. No one on the team had any GraphQL experience, but we had a vague notion that many ideas in operational security were best expressed through a graph. Ultimately, after some investments in tooling and culture, GraphQL ended up being the perfect tool for us.
In this post, we’ll admit some early pitfalls that we encountered in our initial implementation and explain how we make sure not to repeat the mistakes of the past.
If you want to skip to the good stuff, we’ve also open-sourced our GraphQL style guide along with a corresponding eslint plugin.
What is GraphQL?
GraphQL is a query language for APIs. It allows an API designer to define a nested schema, or “graph.” The cool thing about GraphQL is that a client can drill down to multiple levels of depth in a single query, requesting only exactly the data it wants.
A schema might look like this:
Which lets a client who doesn’t care about age fetch just the name and favorite character:
On the server side, the snippets of code that populate the requested fields are called resolvers.
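To make the idea of resolvers concrete, here is a minimal sketch in plain TypeScript. It does not use a real GraphQL server; the `User` shape, the sample data, and the `resolveSelection` helper are all assumptions made for illustration. It shows that only the resolvers for requested fields run, so a client that skips `age` never pays for it.

```typescript
// Hypothetical type with the three fields mentioned above.
interface User {
  name: string;
  age: number;
  favoriteCharacter: string;
}

// A resolver computes the value of one field from its parent object.
type Resolver = (parent: User) => unknown;

// Record which resolvers run so we can see that unrequested fields are skipped.
const resolverCalls: string[] = [];

const userResolvers: Record<keyof User, Resolver> = {
  name: (user) => {
    resolverCalls.push("name");
    return user.name;
  },
  age: (user) => {
    resolverCalls.push("age");
    return user.age;
  },
  favoriteCharacter: (user) => {
    resolverCalls.push("favoriteCharacter");
    return user.favoriteCharacter;
  },
};

// Toy executor: run only the resolvers for the fields named in the selection.
function resolveSelection(
  parent: User,
  selection: (keyof User)[],
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const field of selection) {
    result[field] = userResolvers[field](parent);
  }
  return result;
}

const storedUser: User = { name: "Ada", age: 36, favoriteCharacter: "Totoro" };

// Equivalent in spirit to a query that asks only for name and favoriteCharacter.
const response = resolveSelection(storedUser, ["name", "favoriteCharacter"]);
```

A real server such as Apollo performs this field-by-field dispatch for you; the point here is only that each field maps to one function.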
Read more about GraphQL basics in the official documentation.
Pitfall #1: Not enough tooling
What we did wrong
As the API surface area grows, GraphQL development can become painful without the right tooling. Vanta defines its GraphQL schema using the Schema Definition Language rather than letting resolvers implicitly define the schema, since we want API designers to think about the schema without worrying about implementation details. But forgetting to define a resolver for a field in the schema can cause disruptive runtime failures.
How we fixed it
We use TypeScript for all of Vanta’s microservices, and we’re proud that our codebase is fully typed. GraphQL schemas are typed, but they’re not typed in TypeScript. We needed to bridge the gap and somehow convert our schema into TypeScript types so the typechecker could enforce the shapes of our resolvers.
Luckily, other people have also tried to convert GraphQL types to TypeScript types, so we can use an off-the-shelf tool for most of the heavy lifting. We’ve had great success with GraphQL Code Generator – whenever the schema changes, we run a command that generates all of the types we need. Not only do we get resolver types, but we also generate client types and React hooks for our GraphQL client that are fully typed and ready to use.
We use Apollo to power our GraphQL server. Apollo provides a default resolver for every field in our schema that lacks an explicit one, which is generally convenient. Taking an example from the Apollo docs, let’s say you have a schema that looks like this:
If you write a resolver for the books field that returns an array of objects with a title field, Apollo is smart enough to return that field by default instead of requiring a developer to implement a trivial resolver.
Unfortunately, this behavior means that TypeScript cannot consistently check whether a resolver is missing or mistyped (since the default case might work). To replace Apollo’s implicit resolvers, we wrote a custom code generator that generates explicit default resolvers which can be overridden. This lets TypeScript ensure that all resolvers are defined without requiring any extra boilerplate.
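The following is a rough sketch of what explicit, overridable default resolvers might look like. It is not Vanta's generator or its output: the `Book` shape (with `title` and `author` fields), the `ResolverMap` type, and the `defaultResolver` helper are all hypothetical names chosen for illustration.

```typescript
// Assumed shape of a schema type, as a code generator might emit it.
interface Book {
  title: string;
  author: string;
}

// Every field of T must have a resolver with a matching return type.
type ResolverMap<T> = { [K in keyof T]: (parent: T) => T[K] };

// An explicit version of the "return the same-named property" default.
function defaultResolver<T, K extends keyof T>(field: K): (parent: T) => T[K] {
  return (parent) => parent[field];
}

// What generated code could look like: one explicit default per field.
// Deleting either entry is a compile-time error, because ResolverMap<Book>
// requires both `title` and `author`.
const generatedBookResolvers: ResolverMap<Book> = {
  title: defaultResolver<Book, "title">("title"),
  author: defaultResolver<Book, "author">("author"),
};

// Hand-written code overrides only the fields that need custom logic.
const bookResolvers: ResolverMap<Book> = {
  ...generatedBookResolvers,
  title: (book) => book.title.trim(),
};

const sampleBook: Book = { title: "  Dune ", author: "Frank Herbert" };
const resolvedTitle = bookResolvers.title(sampleBook);
const resolvedAuthor = bookResolvers.author(sampleBook);
```

Because the defaults are ordinary typed values rather than runtime behavior, a resolver that returns the wrong type fails typechecking just like any other mistyped function.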
Between our generated resolvers and our generated types, we’ve eliminated whole classes of bugs that are now caught by the TypeScript type system.
Now, any engineer can easily:
1. Add a field to the schema
2. See where TypeScript is angry
3. Fix the errors from step 2
4. Rinse and repeat
Type systems aren’t the only way to increase development velocity. We also want to make sure that every change to our schema conforms to the norms that we’ve defined in our style guide without having to go through several rounds of code review.
Norms are only as good as their enforcement. With a quickly growing engineering team and exponentially expanding requirements, we can’t afford to have a GraphQL czar who reviews every single change to ensure that it lines up with her mental model. Instead, we’ve found that linters and other automated tools are the best solution.
Linters and autoformatters are much more effective than code reviewers when it comes to enforcing norms that are automatically checkable. Code reviewers are still expected to review changes with the style guide in mind, but the majority of rules in the style guide are enforced by our eslint plugin which runs automatically. On the flip side, if it’s too tricky to write a lint rule for some aspect of the style guide, it may suggest that the style guide is not prescriptive enough.
Pitfall #2: REST-ful GraphQL isn’t restful for devs
What we did wrong
Since GraphQL is so flexible, it’s easy to accidentally superimpose a REST-ful mindset on top of a GraphQL schema. GraphQL isn’t REST, and shouldn’t be shoehorned into traditional REST patterns.
A REST API tends to have endpoints that return data about one kind of thing along with pointers to other related data. /org/:orgId/users might return a list of each user in some organization with some metadata about those users, and /users/:userId/posts might return a list of posts for one particular user. This is easy to reason about but each endpoint tends to be a totally distinct resource with its own logic.
GraphQL allows an API designer to express an API as, well, a graph. This only makes sense when the data has a graph structure – but lots of business data does. For example, an organization might have a bunch of employees with computers, each with a list of installed applications. This could be expressed in a graph like this:
Superimposing a REST mindset on this graph works, but it requires lots of extra frontend logic and sometimes even extra network requests. In the most extreme case, a schema might look like this:
If a client wants the names of all of the applications installed on a computer in some organization, they have to make at least four round trips to the server:
- Query for the list of user IDs in the current organization
- Query for the list of computers owned by each user from the previous step
- Query for the list of applications installed on each computer from the previous step
- Query for the metadata about each application from the previous step
Of course, the API designer might add an applicationsByOrganizationId query to get this data directly, but that doesn’t solve the general problem. Every time a new use case is discovered, either an API designer has to add a new endpoint and write custom code to support it, or the client has to make multiple round trips and do all sorts of complicated joining logic on its own side.
How we fixed it
Using GraphQL how it was intended solves this problem beautifully. Instead of returning IDs, which are basically pointers to other parts of the graph, a more reasonable schema would look like this:
Now, the client can make one straightforward query to get all the data it needs:
This also makes the server side implementation of each one of these types much easier – resolver logic only has to be implemented once per type, instead of once per business use case.
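As an illustration of "once per type," here is a small sketch using in-memory data. The row shapes, IDs, and sample names are all invented for the example. Storage still keeps IDs, but each resolver turns those IDs into an edge to the whole object, so the joining happens on the server within a single request.

```typescript
// Hypothetical storage rows keep foreign keys, as a database would.
interface ApplicationRow { id: string; name: string }
interface ComputerRow { id: string; applicationIds: string[] }
interface UserRow { id: string; name: string; computerIds: string[] }
interface OrganizationRow { id: string; userIds: string[] }

const applicationsById: Record<string, ApplicationRow> = {
  a1: { id: "a1", name: "Slack" },
  a2: { id: "a2", name: "Zoom" },
};
const computersById: Record<string, ComputerRow> = {
  c1: { id: "c1", applicationIds: ["a1", "a2"] },
  c2: { id: "c2", applicationIds: ["a1"] },
};
const usersById: Record<string, UserRow> = {
  u1: { id: "u1", name: "Ada", computerIds: ["c1"] },
  u2: { id: "u2", name: "Grace", computerIds: ["c2"] },
};
const organization: OrganizationRow = { id: "o1", userIds: ["u1", "u2"] };

// One resolver per edge, written once per type and reused by every query.
const resolvers = {
  Organization: {
    users: (org: OrganizationRow) => org.userIds.map((id) => usersById[id]),
  },
  User: {
    computers: (user: UserRow) => user.computerIds.map((id) => computersById[id]),
  },
  Computer: {
    applications: (computer: ComputerRow) =>
      computer.applicationIds.map((id) => applicationsById[id]),
  },
};

// Roughly what a server does for one nested query such as
//   organization { users { name computers { applications { name } } } }
function applicationNamesByUser(org: OrganizationRow): Record<string, string[]> {
  const result: Record<string, string[]> = {};
  for (const user of resolvers.Organization.users(org)) {
    const names: string[] = [];
    for (const computer of resolvers.User.computers(user)) {
      for (const app of resolvers.Computer.applications(computer)) {
        names.push(app.name);
      }
    }
    result[user.name] = names;
  }
  return result;
}

const namesByUser = applicationNamesByUser(organization);
```

A new use case, such as listing computers per organization, would reuse the same three resolvers rather than needing a new endpoint.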
To see a more complete example, take a gander at the relevant section of our style guide.
How we enforce the fix
Of the three pitfalls discussed in this post, this is the one with the least automated enforcement, since – more than anything – it’s just a new mindset about how to design GraphQL APIs. However, we did come up with a couple of guidelines that we always look for in code review:
- Rarely offer id fields in GraphQL types. Instead, just add an edge to the whole object. If the client just needs the ID, they can query for the ID on the type itself.
- Don’t be afraid to add extra fields to some type. Unlike a traditional API where the same logic runs every time, the code backing these fields only gets executed when someone wants the field.
- There should be one type per platonic ideal of a business object. Instead of returning a “UserById” type, return a “User” type. The client decides what fields are important to them – and permissions should be enforced at a different layer.
Pitfall #3: Friendly denials of service
What we did wrong
Unlike a traditional REST API which has a finite number of possible routes, GraphQL allows a client to request arbitrary information in infinite ways. This is nice for the client but makes it hard to ensure that even a friendly client doesn’t accidentally send a request that triggers a million database queries. I can neither confirm nor deny whether I accidentally wrote a query that did just that.
It’s relatively straightforward to estimate the cost of a query if you know exactly how many resources the query will return. For example, GitHub’s API docs explain how to do GraphQL costing in a pretty clever way.
However, since our GraphQL schema used to include lists of arbitrary length, it was impossible to estimate the cost of a query ahead of time. A users field that returns all of the users in some organization is fine when there are 100 users, but is likely to cause problems when there are 100,000. We didn’t worry at all about pagination in the early days of Vanta, but as our customer base and complexity grew, we recognized a need for it.
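To see why unbounded lists defeat cost estimation, consider this simplified sketch. It is not GitHub's exact formula or anything Vanta runs; the `ListSelection` shape and the `maxNodes` function are hypothetical. The idea is that the worst-case number of nodes is a product of page sizes along each nested path, which is only finite when every list declares a page size.

```typescript
// A list field in a query. `first` is the requested page size;
// leaving it undefined models a list of arbitrary length.
interface ListSelection {
  field: string;
  first?: number;
  children: ListSelection[];
}

// Upper bound on the number of nodes this selection can return,
// or Infinity if any list on the path has no page size.
function maxNodes(selection: ListSelection): number {
  if (selection.first === 0) return 0; // an empty page costs nothing
  const pageSize = selection.first === undefined ? Infinity : selection.first;
  let nodesPerItem = 1; // the item itself
  for (const child of selection.children) {
    nodesPerItem += maxNodes(child);
  }
  return pageSize * nodesPerItem;
}

// users(first: 100) { computers(first: 10) { applications(first: 50) } }
const paginatedQuery: ListSelection = {
  field: "users",
  first: 100,
  children: [
    {
      field: "computers",
      first: 10,
      children: [{ field: "applications", first: 50, children: [] }],
    },
  ],
};

// users { computers(first: 10) } with no page size on the outer list
const unboundedQuery: ListSelection = {
  field: "users",
  children: [{ field: "computers", first: 10, children: [] }],
};

const paginatedCost = maxNodes(paginatedQuery); // 100 + 1,000 + 50,000 = 51,100
const unboundedCost = maxNodes(unboundedQuery); // Infinity
```

With a finite bound, a server can reject or throttle a query before running it; with an unbounded list, there is nothing to compare against a budget.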
How we fixed it
We wouldn’t know which queries to optimize and which code-paths are hot without monitoring. We use Datadog APM fairly heavily to monitor our GraphQL API’s performance and understand when queries are performing slower than expected. We can even see which parts of the query are taking especially long to resolve. This monitoring let us know that we should focus on two major themes: pagination and dataloaders.
There are some holy wars when it comes to GraphQL pagination, but we landed on the Relay spec since it met our needs for cursor-based pagination.
When we started using the Relay spec for pagination, we noticed that well-intentioned engineers were adding new, un-paginated fields faster than we were converting old fields to paginated versions! Once again, tooling came to our aid.
We introduced a lint rule that complains whenever we introduce a new list field that isn’t a Relay connection. We cap the number of nodes returned by each connection, so this ensures that we’re never returning lists of unbounded length.
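As a rough sketch of capped, cursor-based pagination in the style of the Relay spec, the following uses an in-memory array. The `MAX_PAGE_SIZE` value, the `paginate` helper, and the index-based cursors are assumptions for illustration; real cursors are usually opaque strings, and the full spec also defines `hasPreviousPage` and `startCursor`, which are omitted here.

```typescript
interface RelayEdge<T> {
  cursor: string;
  node: T;
}
interface RelayPageInfo {
  hasNextPage: boolean;
  endCursor: string | null;
}
interface RelayConnection<T> {
  edges: RelayEdge<T>[];
  pageInfo: RelayPageInfo;
}

// Hypothetical server-side cap on how many nodes one page may return.
const MAX_PAGE_SIZE = 100;

// Return up to `first` items that come after the `after` cursor.
// The cursor here is simply the item's index encoded as a string.
function paginate<T>(items: T[], first: number, after?: string): RelayConnection<T> {
  const limit = Math.max(0, Math.min(first, MAX_PAGE_SIZE));
  const start = after === undefined ? 0 : parseInt(after, 10) + 1;
  const edges = items.slice(start, start + limit).map((node, i) => ({
    cursor: String(start + i),
    node,
  }));
  return {
    edges,
    pageInfo: {
      hasNextPage: start + edges.length < items.length,
      endCursor: edges.length > 0 ? edges[edges.length - 1].cursor : null,
    },
  };
}

const allUsers: string[] = [];
for (let i = 0; i < 250; i++) {
  allUsers.push("user-" + i);
}

// A client asking for 1,000 users still receives at most MAX_PAGE_SIZE.
const firstPage = paginate(allUsers, 1000);
// Asking for the items after cursor "199" returns the final 50.
const lastPage = paginate(allUsers, 100, "199");
```

The cap is what turns "a list" into "a list of bounded length," no matter what page size the client requests.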
We found, though, that not all lists need to be paginated. Sometimes, we know a list is going to be small no matter what. For those cases, we introduced a @tinylist directive which lets the linter know “this list is of small, constant-ish length, no need to paginate.”
Now, a developer who needs a new list type either must implement pagination or face the kindly wrath of a code reviewer asking why a list that is definitely not of constant length is marked as a @tinylist.
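The check itself is simple to picture. The sketch below is not the actual eslint plugin: it works on a simplified, hypothetical `FieldDef` description of each schema field rather than a parsed schema, and the type and field names are made up. It flags any field whose return type is a plain list unless the field carries the `@tinylist` directive.

```typescript
// Simplified, hypothetical view of one field in the schema. A real lint
// rule would walk the parsed schema instead of using this structure.
interface FieldDef {
  parentType: string;
  fieldName: string;
  returnType: string; // SDL type, e.g. "[User!]!" or "UserConnection!"
  directives: string[]; // directive names without the leading "@"
}

// In SDL, list types are written with square brackets, e.g. [User!]!
function isListType(returnType: string): boolean {
  return returnType.trim().charAt(0) === "[";
}

// Report every plain list field that is not explicitly marked as tiny.
function findUnpaginatedLists(fields: FieldDef[]): string[] {
  return fields
    .filter((f) => isListType(f.returnType) && f.directives.indexOf("tinylist") === -1)
    .map((f) => `${f.parentType}.${f.fieldName}: paginate this list or mark it @tinylist`);
}

const schemaFields: FieldDef[] = [
  { parentType: "Organization", fieldName: "users", returnType: "[User!]!", directives: [] },
  { parentType: "Organization", fieldName: "usersConnection", returnType: "UserConnection!", directives: [] },
  { parentType: "User", fieldName: "roles", returnType: "[Role!]!", directives: ["tinylist"] },
];

const lintErrors = findUnpaginatedLists(schemaFields);
```

Only `Organization.users` is reported here: the connection field is not a plain list, and `User.roles` has opted out explicitly, which is exactly the decision a code reviewer can then question.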
Queries often request the same data many times in the same request. Consider the following query:
If there are n users in the system and every user is friends with every other user, then a naive implementation will make n^2 database calls to serve this query, since every user needs to look up the name of each of their friends. However, since friends are shared among users, this is quite redundant; once you’ve looked up a name for some user, you shouldn’t have to look it up again in the same request.
The dataloader pattern resolves this problem. Instead of greedily making all of the expensive calls when we need them, we queue up the requests, de-duplicate them, and then make them all at once. Our key insight was that the dataloader pattern is not an “as needed” pattern – since clients can make arbitrary requests, we want to dataload nearly all the time. Wherever possible, we enforce that our resolvers use dataloaders to load the data they need. To maintain development velocity while requiring dataloading, we’ve invested in some generic higher-order functions to make it easy to convert a database query into a dataloader. Other companies have taken this a step further and autogenerated dataloaders.
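Here is a minimal sketch of the queue, de-duplicate, and batch idea. It is neither the `dataloader` library nor Vanta's higher-order functions; `TinyLoader`, the sample users, and the batch function are invented for illustration. It schedules each batch on a microtask, which is a simplification of how production loaders pick the moment to dispatch.

```typescript
type BatchFn<K, V> = (keys: K[]) => Promise<V[]>;

interface QueuedLoad<K, V> {
  key: K;
  resolve: (value: V) => void;
  reject: (error: unknown) => void;
}

class TinyLoader<K, V> {
  private batchFn: BatchFn<K, V>;
  private cache = new Map<K, Promise<V>>();
  private queue: QueuedLoad<K, V>[] = [];

  constructor(batchFn: BatchFn<K, V>) {
    this.batchFn = batchFn;
  }

  load(key: K): Promise<V> {
    // De-duplicate: a key requested twice in one request shares one promise.
    const cached = this.cache.get(key);
    if (cached !== undefined) return cached;

    const promise = new Promise<V>((resolve, reject) => {
      if (this.queue.length === 0) {
        // First key of a new batch: dispatch once the current synchronous
        // work has finished queuing its keys.
        Promise.resolve().then(() => this.dispatch());
      }
      this.queue.push({ key, resolve, reject });
    });
    this.cache.set(key, promise);
    return promise;
  }

  private dispatch(): void {
    const batch = this.queue;
    this.queue = [];
    // One call for the whole batch; values must come back in key order.
    this.batchFn(batch.map((item) => item.key)).then(
      (values) => batch.forEach((item, i) => item.resolve(values[i])),
      (error) => batch.forEach((item) => item.reject(error)),
    );
  }
}

// Stand-in for a database: records every batched call it receives.
const batchCalls: string[][] = [];
const userNames: Record<string, string> = { u1: "Ada", u2: "Grace" };

const nameLoader = new TinyLoader<string, string>((ids) => {
  batchCalls.push(ids);
  return Promise.resolve(ids.map((id) => userNames[id]));
});

// Two users who are friends with each other: each name is requested twice,
// as it would be for a query like users { name friends { name } }.
const loadedNames = Promise.all([
  nameLoader.load("u1"),
  nameLoader.load("u2"),
  nameLoader.load("u2"),
  nameLoader.load("u1"),
]);
```

Four `load` calls collapse into a single batched lookup for two distinct keys, which is the behavior that turns the n^2 case above into something closer to one query per type.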
Along with this blog post, we’ve open-sourced our GraphQL style guide and eslint plugin to share with weary travelers. Not all of these rules will make sense for everyone, and some probably don’t make sense to anyone. But please feel free to use them as an inspiration for your own GraphQL journey. We welcome pull requests – and if you’re interested in working with us, check out the available jobs on our jobs page!
Special thanks to Ellen Finch, Utsav Shah, Neil Patil, and the whole Vanta engineering team for their help editing this blog post and – more importantly – implementing these ideas in our product.