Roundtable on Behavior Driven Development

Posted by Albert Gareev on Jun 10, 2016 | Categories: Community, Notes, WTToronto

Re: Toronto Testing Meetup, May Workshop: Roundtable on Behavior Driven Development


And so it is claimed:

When launching a new digital project, there can be a disconnect between:

• the business being truly able to define the desired outcomes

• the developer’s understanding of what needs to be built, and

• the business’ understanding of the technical challenges their requirements may present.

Behavior Driven Development (BDD) can help achieve all of the above and ultimately, helps a business and its technical team deliver software that fulfills business goals.


We’re going to put that under scrutiny.

The Agenda

  • “Trying out BDD” – experience report presented by Gauri Nayyar
  • Identified mentioned benefits and problems. Discussed how the problems could be avoided or fixed
    • Including technical issues (Could the implementation of BDD have been done better with another approach, another language, another tool, etc.)
    • Including process and people issues (Business, Workflow. What were the reasons for implementing BDD, were they accurately measured, etc.)
  • Questioned the BDD approach, the benefits it promises, talked costs and risks.
  • Considered alternative methods for solving the same problems.

Trying Out BDD

Experience report presented by Gauri Nayyar

(Image by Gauri Nayyar)



Notes captured by Albert Gareev.

  • Testing team felt a lot of pressure keeping up with regression testing during the frequent releases in the agile shop
  • There was a disconnect between how the requirements were given out by Business and understood by programmers and testers
  • BDD approach seemed very promising to solve the mentioned problems
  • Testing team took the initiative to pilot the Behavior Driven Development approach
  • As a managerial decision, the main focus was set on
    • Creating the DSL (Domain Specific Language) library in Gherkin syntax for reusable Acceptance Tests
    • Converting the existing test case suite (in a form of Excel sheets) to Gherkin-based scenarios
  • Immediate challenges
    • The “ubiquitous language” apparently is still easy to misunderstand or misuse.
    • Naming convention challenges: do you say “Log in” or “Sign in”?
    • BA and Dev expected that QA would write all the scenarios
    • Huge backlog of the existing test cases
    • Lack of time to do the pilot when the team is already struggling to keep up with the actual testing
  • Problem: testing that takes 2 hours to perform took 8 hours to describe as Gherkin scenarios
  • As in the image presented by Gauri, the process still looked a lot like a Waterfall model scaled down to a sprint

The Discussion

Discussion notes by David Tangness and Albert Gareev.

Benefits of BDD

  • Get everyone on the same page, develop shared understanding and ownership of the business purpose
  • Testers and Developers may raise and solve concerns which business/client people didn’t think of, and get them addressed early / at low cost

[Those are our immediate points. There may be more, in general or for particular projects.]

Brainstorming Points

  • People may miss the ‘central’ conversation (distributed team, busy that day, moved from another project, newly hired, etc).
    • The fact that tests are written in DSL format means the documents should be relatively easy to read later. Are they?
    • On the other hand, is anyone getting value out of reading a long list of Gherkin scenarios?
  • When expressing requirements in Gherkin syntax, how do you limit the number of scenarios?
    • It seems like Gherkin scenarios are supposed to map to business rules, so the number of scenarios doesn’t exceed what you’d have if you just wrote down all the business rules. But then how do we use the scenarios?
    • For any input/output, how do you figure out if it satisfies the scenarios? You need to break the input/output combo into relevant factors, and compare each factor against scenarios. But you can only do that if you know which business rules map to each scenario. So what was the point of the scenario in the first place?
    • Takeaway. Need to see some examples of “how people choose / limit their Gherkin scenarios.” It’s not clear that creating Gherkin scenarios is superior to just recording business rules directly.
  • Idea. BDD and Gherkin are separable, we should distinguish between benefits/failings of BDD approach and Gherkin scripts.
    • Takeaway. Are there any examples of BDD that are NOT using Gherkin syntax?
    • If we can figure out the benefits/drawbacks of Gherkin then we should be able to invent an alternative. Surely there must be some alternatives published somewhere?
  • Particular failures with the DSL creation
    • People used different words to describe the same thing (“start page – home page”, “log in – sign in”)
    • Writing tests in Gherkin syntax takes a long time (Feels like just “more work”, “pointless”)
    • Whose responsibility is it to create the DSL library? (Business and Dev weren’t inclined to participate)
    • Another tool was tried to auto-generate the acceptance scenarios, but that added to the confusion and to the workload
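
One way to picture the “log in – sign in” problem is a shared glossary that steps are checked against. The sketch below is only an illustration, not something the team in the report used: the glossary contents and the function names are assumptions, and it lowercases the step text for simplicity.

```python
# Hypothetical glossary: each preferred term maps to the synonyms it replaces.
GLOSSARY = {
    "log in": ["sign in"],
    "home page": ["start page"],
}


def normalize_step(step: str) -> str:
    """Rewrite known synonyms in a step to the team's preferred term.

    The step is lowercased first, so the original casing is not preserved.
    """
    normalized = step.lower()
    for preferred, synonyms in GLOSSARY.items():
        for synonym in synonyms:
            normalized = normalized.replace(synonym, preferred)
    return normalized


def find_synonyms(step: str) -> list[str]:
    """List the non-preferred terms that appear in a step."""
    lowered = step.lower()
    return [s for synonyms in GLOSSARY.values() for s in synonyms if s in lowered]
```

A check like `find_synonyms` could flag inconsistent wording at review time, but somebody still has to own and agree on the glossary, which is the responsibility question raised above.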

[Conclusion: Hold off on speculating about other issues. These feel like common issues, but surely lots of teams have been successful in creating a DSL. It’ll be easier to think about solutions after we know what success looks like.]

Reviewing Problems

Converting legacy test cases when adopting BDD

  • Question: is that a real problem? Test cases were required for scripted regression testing. But there are more effective and efficient ways to do that than with scripts.
  • Challenge. How do you handle the flood of old test cases which need to be converted?
  • Challenge. How do you keep up your testing progress AND effectively do conversion?


  • Review test coverage, drop redundant test cases
  • Prioritize test cases, convert more important first
  • Make conversion tasks part of the team backlog and treat accordingly
  • DO NOT convert at all, as BDD is NOT for regression testing

Overhead of maintaining Gherkin files

  • Question: what is the purpose of those Gherkin files? Who will use them after they are written?
    • If you can’t find a clear beneficiary, then maybe you shouldn’t be maintaining Gherkin files at all
    • Takeaway. Think about who benefits from those files. Weigh their benefit against the costs of maintenance.

Onboarding new people into a BDD environment

  • This seems like part of the general problem of transitioning to Agile
  • Learning Gherkin and DSL is not that difficult

[Stopping for now to re-focus]

“This Is the House That Jack Built”: Practicing writing Gherkin scenarios


A web page which upon opening displays “Hello, World”

First pass

Given I am a user
When I navigate to the URL
Then “Hello, World” is displayed

Second pass

Given I am a user
When I navigate to the URL
Then “Hello, World” is displayed
And the text is in the center of the screen
And the text color is yellow

Third pass

Given I am a user
And my device is <some supported device>
And my browser is <some supported browser>
When I navigate to the URL
Then “Hello, World” is displayed
And the text is in the center of the screen
And the text color is yellow


  • We need to “parameterize” Gherkin with sets like “supported devices” and “supported browsers”
    • And there certainly should be some logic for the case when the device or browser is not among the supported ones
  • If we continue adding details, will the scenario end up being too long?
  • Should we split it into multiple scenarios?
    • E.g. Have separate “reusable” scenarios.
  • We need many analog details which will take a very, very long time to communicate in text.
    • How do we express this in Gherkin? “A picture is worth a thousand words”.
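
To get a feel for what “parameterizing” means for the number of scenarios, here is a minimal sketch that expands one template over every device/browser combination. The device and browser lists are made up for illustration; the real sets would have to come from the business. (Gherkin itself offers Scenario Outline with an Examples table for this kind of expansion; the Python below only illustrates the combinatorics.)

```python
from itertools import product

# Hypothetical sets; the real lists would come from the business.
SUPPORTED_DEVICES = ["desktop", "tablet", "phone"]
SUPPORTED_BROWSERS = ["Chrome", "Firefox"]

SCENARIO_TEMPLATE = """\
Given I am a user
And my device is {device}
And my browser is {browser}
When I navigate to the URL
Then “Hello, World” is displayed"""


def expand_scenarios(devices, browsers):
    """Return one concrete scenario per device/browser combination."""
    return [
        SCENARIO_TEMPLATE.format(device=d, browser=b)
        for d, b in product(devices, browsers)
    ]


scenarios = expand_scenarios(SUPPORTED_DEVICES, SUPPORTED_BROWSERS)
print(len(scenarios))  # 3 devices x 2 browsers = 6 scenarios
```

Every additional parameter set multiplies the count, which is one reason the question of splitting and reusing scenarios comes up so quickly.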

Fourth pass

Given I am a user
And my device is <some supported device>
And my browser is <some supported browser>
When I navigate to the URL
Then “Hello, World” is displayed
And the text is in the center of the screen
And the text color is yellow
And the text is in capitals
And the text is in 16pt.


  • Then-block by now begs to become a reusable module

Fifth pass

Given I am a user
And my device is <some supported device>
And my browser is <some supported browser>
When I navigate to the URL
Then “Hello, World” is displayed
And the text is in the center of the screen
And the text color is yellow
And the text is in capitals
And the text is in 16pt.
And the user’s IP is captured.


  • We’re going to need another reusable scenario for the IP capture

Sixth pass

Given I am a user
And my device is <some supported device>
And my browser is <some supported browser>
When I navigate to the URL
Then HelloText is displayed
And the text is in the center of the screen
And the text color is yellow
And the text is in capitals
And the text is in 16pt.
And the user’s IP is captured.

Defining modules

“HelloText” would be another action word, defined in Gherkin. Examples defining HelloText for different locations:

Given the user is from Canada
Then HelloText is “Hello, World. Eh?”

Given the user is from North America
Then HelloText is “Hello, World”

Given the user is from Germany
Then HelloText is “Hallo, Welt”

….. and so on. Nobody will want to read this.

  • Concern – Capturing the different translations of the message in this format will produce many, many scenarios.
  • Concern – Overlapping business conditions (Canada is part of North America)
  • Idea: DO NOT capture information in Gherkin if it better fits table, picture, or formula formats
  • Idea: Cover such requirements at unit testing level
    • But from a business perspective we do need to capture the requirement, and if Gherkin is how we’re capturing business requirements then we have no choice.
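
As an illustration of the “better fits a table” idea, the greetings above can be sketched as a lookup table with an explicit rule for the overlap: the most specific location wins. The region mapping, the ordering rule, and the default greeting are all assumptions made for this example; the roundtable did not settle on any of them.

```python
# Hypothetical greeting table, ordered from most specific to least specific.
# The strings come from the examples above; the region model is an assumption.
GREETINGS = [
    ("Canada", "Hello, World. Eh?"),
    ("Germany", "Hallo, Welt"),
    ("North America", "Hello, World"),
]

# Hypothetical mapping of a country to the locations that contain it.
REGIONS = {
    "Canada": ["Canada", "North America"],
    "USA": ["USA", "North America"],
    "Germany": ["Germany"],
}


def hello_text(country, default="Hello, World"):
    """Return the greeting of the first (most specific) matching rule.

    The default for unlisted countries is an assumption of this sketch.
    """
    regions = REGIONS.get(country, [country])
    for location, greeting in GREETINGS:
        if location in regions:
            return greeting
    return default
```

With the table, adding a translation is one more row rather than one more scenario, and the Canada / North America overlap is resolved by the ordering instead of being left implicit.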

[Takeaway: A single feature with many details can produce long or numerous scenarios. Need to research how BDD practitioners solve this.]


[We have learned a lot from the practical exercise. Let’s debrief now and capture the points.]

It is fun

  • Group writing of these Gherkin scenarios indeed feels like fun. It is a creative and social exercise.
  • On the other hand, it feels like it has little to do with the actual work. Programmers will still need to write code, and testers will still need to test.

It’s not that simple

  • What looks like an extremely simple requirement (Display “Hello, World”) becomes a multi-line statement when we take a close look. And notice: we didn’t even discuss cases when something goes wrong.
  • It actually becomes time consuming to write all of the and-blocks.

We can’t base testing only on Gherkin scenarios

  • The scenarios are a kind of a “happy path” description. Things always go wrong, and we need to test around a lot.
  • There are a lot of assumptions that most likely won’t be pre-scripted in Gherkin scenarios. We need to discover and evaluate them.


  • Not all scenarios are automatable.
  • Checking different conditions ought to be done at different levels: unit, API, GUI, through integrated products. That means a single scenario will be torn apart anyway, and tested in a distributed way.
  • Writing all these repetitive scenarios requires automation of its own.
  • Writing interpreter code for all these scenarios becomes a serious task of its own. And testers may not have the deep programming skills to accomplish it.
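
To make the “interpreter code” point concrete, here is a toy sketch of what sits behind each Gherkin line: a pattern and a handler function. It is not the API of Cucumber, behave, or any other real tool; every name in it is hypothetical, the browser is faked with a dictionary, and it uses straight quotes in the step text for simplicity.

```python
import re

# Registry of (compiled pattern, handler) pairs; all names here are hypothetical.
STEP_HANDLERS = []


def step(pattern):
    """Register a handler for steps whose text matches the pattern."""
    def decorator(func):
        STEP_HANDLERS.append((re.compile(pattern), func))
        return func
    return decorator


@step(r"I navigate to the URL")
def navigate(context):
    # Stand-in for driving a real browser.
    context["page_text"] = "Hello, World"


@step(r'"(.+)" is displayed')
def check_text(context, expected):
    assert context["page_text"] == expected


def run_scenario(lines):
    """Run each step; return the steps that no handler matched."""
    context, unmatched = {}, []
    for line in lines:
        # Strip the leading Gherkin keyword (Given/When/Then/And).
        text = re.sub(r"^(Given|When|Then|And)\s+", "", line.strip())
        for pattern, handler in STEP_HANDLERS:
            match = pattern.fullmatch(text)
            if match:
                handler(context, *match.groups())
                break
        else:
            unmatched.append(line)
    return unmatched
```

Even in this toy version, every new phrasing (“the text is in the center of the screen”, “the user’s IP is captured”) needs its own handler, and the steps without one pile up in the unmatched list.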

Modeling and mindset

  • “Ubiquitous language” turned out to be rather poor as a universal modeling tool: some information is better described with a sketch, some as a table, some as a formula.
  • It requires a certain style of thinking if you want to represent everything with Gherkin. At times, it feels unnatural. Would stakeholders even agree to describe everything in Gherkin terms?

Refactoring and maintenance

  • How do you refactor Gherkin? If you have modular scenarios built upon each other, then when you refactor one you must refactor the rest.
  • Same goes for the automation interpreting Gherkin

[End of debriefing]

Closing Notes

Values and problems

We all agreed that trying to write Gherkin scenarios spurred a lot of questions and ideas. The actual artifacts weren’t perceived as important. In fact, after the third pass writing them felt like a nuisance.

In this sense, BDD / ATDD do seem to be adding value. But it feels like we need to “extract” those benefits and leave the burden.

Core value: the meeting of clients / customers / developers / testers, covering each other’s blind spots and ensuring everybody is “on the same page”.

“Would you try it?”

The opinions were split. No one seemed to want to try implementing BDD to the full extent, with conversion of all requirements and all tests to Gherkin scenarios.

Some still maintained a belief that BDD would enable automation that would bring relief from regression testing. Others remained skeptical.

“What are the alternatives?”

As a further expansion, the participants agreed to explore alternative approaches that promise the same value as BDD but don’t introduce the problems and expenses.

In particular, we decided to dedicate the next session to Gerald Weinberg’s books

  • 4 responses to "Roundtable on Behavior Driven Development"

  • Lance Kind
    10th June 2016 at 19:29

    This is a good, honest experience report. Some early-adopter mistakes happened (common, unfortunately):
    * BDD scenarios weren’t collaborated on from the source of requirements, BA, QA, and devs.
    * BDD scenarios weren’t written at the business level (so of course the business user ran away when thinking of all these detailed scripts). The point of BDD is to get away from writing imperative specifications that attempt to specify everything, which is impossible.

    BDD is a cross-organizational process. In this case, because only QA was involved, the process turned from being *Behavior* Driven Development into using Gherkin to make automated functional tests.

    Unfortunately, many people fall into this trap. I wish I had an internet erasure every time I see a website teaching “how to do BDD” and they got it all wrong. This inspired the content on

    At a minimum, BDD should encompass the BA, the Devs and QA before I’d even bother to use this process. Without involving the business, you’ll still have “requirement misses” because they are necessary for the process to work well. A business person needs to collaborate with such a team at least 2 times a week (Backlog Grooming, Sprint Planning, meeting with the BA to prepare a first cut) and of course highly functioning teams that deliver with little to no requirements misses have a business person (someone who uses the product or manages those who do) touching base (standup meeting) every day.

    If you *only* want automated functional testing, then BDD is the wrong process. ATDD is better (no Gherkin. QA/Devs write automated tests for acceptance criteria for every story you’re building in that Sprint.)

  • Griffin Jones
    13th June 2016 at 5:48

    To the degree that it helps create / encourage a structured conversation around scenarios between team members – it is good.

    To the degree that it creates a simple to understand, fast to run, easy to maintain subset of the most basic scenarios that everyone agrees are important and should always work ‘correctly’ – it is good.

    To the degree that there are no clear stopping criteria, that the number of scenarios is infinite, that adding one more scenario is perceived as free (Weinberg’s White Bread Warning), that the tool tends to reshape your mind and how we think of the testing problem (when you have a hammer …; and Gherkin passing = testing done = ship it = lack of fear of the false negative), and that it is easy to slip from mastering and applying the tool into becoming a servant of the tool – that is a problem.

    For me, I prefer to focus on ATDD versus BDD. I prefer getting the conversations right rather than avoiding a conversation by using a formal, detailed grammar tool.

  • Griffin Jones
    13th June 2016 at 5:52

    BTW – Great experience report and discussion. I feel your BDD pain. Where I have used it in the past (can think of three examples), it went from adoption to abandonment/refactored into ATDD in two years.

  • Paul Grizzaffi
    13th June 2016 at 21:51

    Great report.

    I like the opening where you mention the disconnects from the BDD guide.

    The report then quickly dives into matters of the automation. Valid considerations to be sure, but I find them to be a bit premature.

    In my experience, the really hard stuff is changing a team’s culture to adopt and adapt to BDD or any “test first approach” (which is truly an unfortunate category name). If you can’t get the culture right, the BDD automation frameworks can be cumbersome and not worth the overhead. See my article here for an overview of my thoughts:

    Griffin’s and Lance’s comments above ring true for me, though I have typically used BDD and ATDD interchangeably; it appears I need to do some more homework on the distinction.


Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported
This work by Albert Gareev is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported.