Thursday, April 25, 2024

Working with Markdown in PHP


Markdown is a markup language that's very useful for web developers. It can be used for writing technical documentation, blogs, books, and even comments on websites such as GitHub.

In this article, we'll look at what Markdown is, the benefits of using it, and how to convert Markdown to HTML using PHP. We'll also cover how you can create your own CommonMark PHP extensions to add new features and syntax to your Markdown files.

What Is Markdown?

Before we touch any code, let's first look at what Markdown is, its history, and a few different examples of how you can use it.

Markdown is a markup language that you can use to create formatted text, such as HTML. For example, in a Markdown file, you can write # Heading 1, which can be converted to the following HTML: <h1>Heading 1</h1>. It lets you write documentation, for example, without prior knowledge of the intended output format (in this case, HTML). You can also create other elements, such as the following:

  • ## Heading 2 which would output: <h2>Heading 2</h2>

  • **Bold text** which would output: <strong>Bold text</strong>

  • _Italic text_ which would output: <em>Italic text</em>
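To make the mapping above concrete, here is a deliberately tiny sketch in plain PHP that handles only the heading, bold, and italic syntax from this list. It is not how a CommonMark parser works (real parsers build a document tree rather than running regular expressions), and the function name toyMarkdownToHtml is made up for this illustration.

```php
<?php

declare(strict_types=1);

/**
 * Toy converter for illustration only: handles "#" headings,
 * **bold**, and _italic_ on a single line. Not CommonMark-compliant.
 */
function toyMarkdownToHtml(string $line): string
{
    // Escape any raw HTML first so the input cannot inject tags.
    $html = htmlspecialchars($line, ENT_NOQUOTES, 'UTF-8');

    // **text** becomes <strong>text</strong>.
    $html = preg_replace('/\*\*(.+?)\*\*/', '<strong>$1</strong>', $html) ?? $html;

    // _text_ becomes <em>text</em>.
    $html = preg_replace('/_(.+?)_/', '<em>$1</em>', $html) ?? $html;

    // One to six leading "#" characters followed by a space mark a heading.
    if (preg_match('/^(#{1,6}) (.+)$/', $html, $matches) === 1) {
        $level = strlen($matches[1]);

        return "<h{$level}>{$matches[2]}</h{$level}>";
    }

    return $html;
}

echo toyMarkdownToHtml('## Heading 2'), PHP_EOL;   // <h2>Heading 2</h2>
echo toyMarkdownToHtml('**Bold text**'), PHP_EOL;  // <strong>Bold text</strong>
echo toyMarkdownToHtml('_Italic text_'), PHP_EOL;  // <em>Italic text</em>
```

For anything beyond a demonstration, a specification-compliant parser is the safer choice, because edge cases such as nested emphasis are where naive converters start to disagree.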

You can even write tables like this:

| Name  | Age | Favourite Colour |
|-------|-----|------------------|
| Joe   | 30  | Red              |
| Alice | 41  | Green            |
| Bob   | 52  | Blue             |

This table would be output as the following HTML:

<table>
    <thead>
        <tr>
            <th>Name</th>
            <th>Age</th>
            <th>Favourite Colour</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>Joe</td>
            <td>30</td>
            <td>Red</td>
        </tr>
        <tr>
            <td>Alice</td>
            <td>41</td>
            <td>Green</td>
        </tr>
        <tr>
            <td>Bob</td>
            <td>52</td>
            <td>Blue</td>
        </tr>
    </tbody>
</table>

Markdown was originally created in 2004 by John Gruber and Aaron Swartz, with the main goal of readability. They intended it to be a markup language that was easy to understand without being rendered. For example, in most cases, you can clearly see the contents of a table in Markdown (as in the example above) without having to convert and render it first, whereas a table in HTML isn't always as easy to understand at first glance.

When Markdown was first created, the initial specification for the language was built around a syntax description and a Perl script (called Markdown.pl). You could run your Markdown content through the script, and it would output HTML. However, the script had some bugs, and the original syntax description contained ambiguities. Hence, as the script was ported to different languages and software, many differing implementations emerged. This meant that running your Markdown content through one converter could produce different output than running it through another.

To address this issue, a specification named CommonMark was released in 2014. In CommonMark's own words, it is a “strongly defined, highly compatible specification of Markdown”. The specification aims to remove ambiguity so that, regardless of which CommonMark-compatible script you use to convert Markdown, the output is always the same.

CommonMark is used by various sites, such as GitHub, GitLab, Reddit, Discourse, Stack Overflow, and Stack Exchange. Whenever you write Markdown on one of these sites, it is converted according to that site's specification. It's worth noting, though, that some of these, such as GitHub, use their own “flavor” of Markdown. For example, GitHub uses “GitHub Flavored Markdown” (GFM), which is a superset of CommonMark with extra features (usually called extensions). Hence, you can use the existing CommonMark features along with some additions. To give this a bit of context, let's take a quick look at an example of something that's supported in GFM but not in the regular CommonMark specification:

GFM allows you to add strikethrough text:

~~This is strikethrough~~. This is not.

Using GFM, this would result in the following output:

<p><del>This is strikethrough</del>. This is not.</p>

The Benefits of Using Markdown

As a developer, using Markdown can be quite helpful. It can be used when writing documentation for your project, package, or library. You can also use it for technical writing, such as for a blog. In fact, if you've ever read a “README” file on GitHub for a package that you're using in your project, it was most likely written in Markdown.

As we've already seen, Markdown can help give semantic meaning to your content, and it generally doesn't need to be rendered for you to understand it. This is useful when multiple people are contributing to a file because familiarity with the styling of the output is not required. For example, the contents of the Laravel documentation are contained in a repository on GitHub (laravel/docs). It's completely open for anyone to contribute to, without needing to know about the CSS classes or styling that the site will use during rendering. This means that anyone familiar with Markdown can jump right in and start contributing with minimal blockers.

Another significant benefit of using Markdown is its generally platform-agnostic nature. Have you ever created a document in Microsoft Word, opened it in Google Docs, and found that the document looks different? Maybe the tables aren't the same size? Or the text that fits perfectly at the end of a page in Word overflows onto the next page in Google Docs? Markdown reduces the likelihood of these issues by only concerning itself with structure, not styling. Instead, the styling is typically applied to the HTML output.

Because Markdown usually defines only the structure and content rather than the styling, it can be converted into many formats. You can convert content to HTML as well as to other formats, such as PDF, EPUB, and MOBI. You might want to use these formats if you're using Markdown to write an e-book that will be read on e-readers.

Rendering Markdown in PHP Using CommonMark

Now that we've taken a brief look at what Markdown is and some of its benefits, let's explore how to use it in our PHP projects.

To render Markdown files, we'll use the league/commonmark package. The package's own documentation covers its features in full.

Installing the Package

To install the package using Composer, we'll use the following command:

composer require league/commonmark

Basic Usage

Once you have installed the package, you can render HTML:

use League\CommonMark\CommonMarkConverter;

$output = (new CommonMarkConverter())->convert('# Heading 1')->getContent();

// $output will be equal to: "<h1>Heading 1</h1>\n"

As you can see, it's very easy to use!

The package also comes with a GithubFlavoredMarkdownConverter class that we can use to convert Markdown to HTML using the “GitHub Flavored Markdown” flavor. We can call it in exactly the same way as the CommonMarkConverter class:

use League\CommonMark\GithubFlavoredMarkdownConverter;

$output = (new GithubFlavoredMarkdownConverter())->convert('~~This is strikethrough~~. This is not.')->getContent();

// $output will be equal to: "<p><del>This is strikethrough</del>. This is not.</p>\n"

It's worth noting that calling the convert method returns an object that implements the League\CommonMark\Output\RenderedContentInterface interface. As well as calling the getContent method to get the HTML, you can also cast the object to a string to achieve the same output:

use League\CommonMark\GithubFlavoredMarkdownConverter;

$output = (string) (new GithubFlavoredMarkdownConverter())->convert('~~This is strikethrough~~. This is not.');

// $output will be equal to: "<p><del>This is strikethrough</del>. This is not.</p>\n"

Configuration and Security

By default, the CommonMark PHP package is designed to be 100% compliant with the CommonMark specification. However, depending on your project and how you're using Markdown, you might want to change the configuration used for conversion to HTML.

For example, if we wanted to prevent <strong> HTML tags from being rendered, we could define our configuration and pass it to our converter:

use League\CommonMark\CommonMarkConverter;

$config = [
    'commonmark' => [
        'enable_strong' => false,
    ],
];

$output = (new CommonMarkConverter($config))->convert('**This text is bold**')->getContent();

// $output will not contain any <strong> tags.

As you can see, we defined the config in a $config variable and then passed it to the CommonMarkConverter's constructor. As a result, the output text is not wrapped in a <strong> tag.

We can also use the configuration to improve the security of our output HTML.

For example, let's say that we have a blog, and we allow readers to comment on the blog posts using Markdown. Whenever a reader loads a blog post in their browser, the comments will also be displayed. Because Markdown can include HTML, malicious comments could create a cross-site scripting (XSS) attack.

To give this some context, let's look at how the CommonMark PHP package converts HTML by default:

use League\CommonMark\CommonMarkConverter;

$output = (new CommonMarkConverter())->convert('Before <script>alert("XSS Attack!");</script> After')->getContent();

// $output will still contain the raw tags: <script> ... </script>

As you can see, the <script> tags weren't removed or escaped! Thus, if this were rendered in a user's browser, whatever is inside the <script> tags would be run.

To prevent this from happening, you can take two different approaches: escape the HTML, or remove it altogether.

To start, we could escape the HTML by setting the html_input configuration option to escape:

use League\CommonMark\CommonMarkConverter;

$output = (new CommonMarkConverter(['html_input' => 'escape']))->convert('Before <script>alert("XSS Attack!");</script> After')->getContent();

// $output will contain the escaped tags instead: &lt;script&gt; ... &lt;/script&gt;
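To see what escaping means in practice, here is a small sketch in plain PHP, independent of the CommonMark package, that runs the same comment through the built-in htmlspecialchars function. The package's escape mode may differ in detail (for example, in how it treats quotes or the text around the tags), so treat this as an illustration of the idea rather than a description of its internals.

```php
<?php

declare(strict_types=1);

$comment = 'Before <script>alert("XSS Attack!");</script> After';

// ENT_NOQUOTES leaves quotes alone; only &, < and > are converted to entities.
$escaped = htmlspecialchars($comment, ENT_NOQUOTES, 'UTF-8');

echo $escaped, PHP_EOL;
// Before &lt;script&gt;alert("XSS Attack!");&lt;/script&gt; After
```

Because the angle brackets are now entities, a browser displays the tag as text instead of executing it.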

Alternatively, if we wanted to completely remove the HTML, we could set the html_input configuration option to strip:

use League\CommonMark\CommonMarkConverter;

$output = (new CommonMarkConverter(['html_input' => 'strip']))->convert('Before <script>alert("XSS Attack!");</script> After')->getContent();

// $output will no longer contain the <script> and </script> tags.
// Only the tags themselves are removed; the text between them is still rendered as plain text.

For a full list of the configuration and security options that the CommonMark PHP package offers, you can check out the documentation.
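As a starting point for untrusted input such as reader comments, a stricter configuration might look like the sketch below. The html_input option comes from the examples above; allow_unsafe_links and max_nesting_level are option names as I understand them from the package's documentation, and the value 20 is an arbitrary choice for illustration, so it's worth checking all of them against the version you have installed.

```php
<?php

declare(strict_types=1);

// A stricter configuration for untrusted input such as reader comments.
$config = [
    // Escape raw HTML instead of passing it through ('strip' removes the tags).
    'html_input' => 'escape',
    // Do not render links that use risky schemes such as javascript:.
    'allow_unsafe_links' => false,
    // Cap how deeply blocks may nest to limit resource usage (20 is arbitrary).
    'max_nesting_level' => 20,
];

// Pass it to the converter in the same way as the examples above:
// $converter = new \League\CommonMark\CommonMarkConverter($config);
```

Keeping these settings in one shared array makes it harder to forget them when a second converter is created elsewhere in the project.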

Using CommonMark PHP Extensions

One of the nice things about the CommonMark package is that it lets you use extensions to enrich Markdown by adding new syntax and features that the parser can use.

The package ships with 18 extensions out of the box that you can use right away in your projects. To show you how to use one of these extensions, we'll look at how to use the “Table of Contents” extension to add a table of contents to our output HTML.

To start, we'll need to define our config using a table_of_contents field and pass it to a new Markdown environment so that we can convert our Markdown:

use League\CommonMark\Environment\Environment;
use League\CommonMark\Extension\CommonMark\CommonMarkCoreExtension;
use League\CommonMark\Extension\HeadingPermalink\HeadingPermalinkExtension;
use League\CommonMark\Extension\TableOfContents\TableOfContentsExtension;
use League\CommonMark\MarkdownConverter;

// Define our config...
$config = [
    'table_of_contents' => [
        'html_class' => 'table-of-contents',
        'position' => 'placeholder',
        'placeholder' => '[TOC]',
    ],
];

// Create an environment using the config...
$environment = new Environment($config);

// Register the core CommonMark parsers and renderers...
$environment->addExtension(new CommonMarkCoreExtension());

// Register the Table of Contents extension (this extension requires the HeadingPermalinkExtension!)
$environment->addExtension(new HeadingPermalinkExtension());
$environment->addExtension(new TableOfContentsExtension());

$output = (new MarkdownConverter($environment))
    ->convert(file_get_contents(__DIR__.'/markdown/article.md'))
    ->getContent();

In the $config array that we passed to the environment, we've specified that anywhere the parser sees [TOC] in the Markdown, it will place a table of contents and give it a CSS class of table-of-contents. Using a CSS class allows us to style the table to fit our intended site's design. As a side note, by default, the extension will use a value of top for the position, which will place the table of contents directly at the top of the output without needing to include a placeholder (e.g., [TOC]). We've also added the HeadingPermalinkExtension extension because the TableOfContentsExtension extension requires it to generate links from the table of contents to the related headings.

To see the full list of options that this extension provides, you can check out the extension's documentation.
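As a variation, here is a sketch of a config that relies on the default top position instead of a placeholder and narrows which headings are listed. The option names style, min_heading_level, max_heading_level, and normalize are taken from the extension's documentation as I remember it, and the specific values are only examples, so verify them against your installed version before relying on them.

```php
<?php

declare(strict_types=1);

// Alternative Table of Contents config: no placeholder needed in the Markdown.
$config = [
    'table_of_contents' => [
        'html_class' => 'table-of-contents',
        // 'top' places the list at the very start of the output.
        'position' => 'top',
        // Render a numbered list instead of bullets.
        'style' => 'ordered',
        // Only include headings from level 2 to level 3.
        'min_heading_level' => 2,
        'max_heading_level' => 3,
        // Nest entries relative to the heading levels that are present.
        'normalize' => 'relative',
    ],
];

// Pass it to the environment as before:
// $environment = new \League\CommonMark\Environment\Environment($config);
```

The rest of the setup (registering the core, heading permalink, and table of contents extensions) stays the same as in the example above.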

Let's say that the article.md file that we passed to the converter contained the following contents:

[TOC]

## Programming Languages

### PHP

### Ruby

### JavaScript

This would result in the following HTML output:

<ul class="table-of-contents">
    <li>
        <p><a href="#programming-languages">Programming Languages</a></p>
        <ul>
            <li>
                <p><a href="#php">PHP</a></p>
            </li>
        </ul>
        <ul>
            <li>
                <p><a href="#ruby">Ruby</a></p>
            </li>
        </ul>
        <ul>
            <li>
                <p><a href="#javascript">JavaScript</a></p>
            </li>
        </ul>
    </li>
</ul>

<h2 id="programming-languages">Programming Languages</h2>

<h3 id="php">PHP</h3>

<h3 id="ruby">Ruby</h3>

<h3 id="javascript">JavaScript</h3>

As you can see, it's very easy to get started with extensions in the CommonMark package. The biggest benefit of using these extensions is that you can enrich your HTML without much manual intervention. However, it's important to remember that if you'll be sharing this Markdown file in multiple places, you should be careful about which extensions (if any) you use. For example, if you write a blog post in Markdown and then cross-post it to many sites, they likely won't support the extra features that you've added to your own site, such as a table of contents. However, if you're using Markdown for your own purposes, such as building a documentation site, the extensions can be extremely powerful.

Creating Your Own CommonMark PHP Extensions

Now that we've looked at how to use the CommonMark package along with extensions, let's look at how to create our own. For the purpose of this article, we'll imagine that we have a documentation site and that we want to have “warning” sections to warn developers of common errors or security vulnerabilities. Therefore, we'll say that anywhere we see {warning} in our Markdown, we'll want to output a warning in the HTML.

First, to create the extension, we need to create a class that implements the CommonMark package's League\CommonMark\Extension\ExtensionInterface interface. This class will only contain a single register method that accepts an instance of League\CommonMark\Environment\EnvironmentBuilderInterface. Hence, the boilerplate of the class will look like this:

namespace App\Markdown\Extensions;

use League\CommonMark\Environment\EnvironmentBuilderInterface;
use League\CommonMark\Extension\ExtensionInterface;

class WarningExtension implements ExtensionInterface
{
    public function register(EnvironmentBuilderInterface $environment): void
    {
        // ...
    }
}

Now that we've created the basic outline for our extension's class, we need to define two new things:

  1. Parser – Here we'll read the Markdown to find any blocks that start with the term: {warning}.
  2. Renderer – Here we'll define the HTML that should be used to replace {warning}.

We'll start by defining our parser class:

namespace App\Markdown\Extensions;

use League\CommonMark\Parser\Block\BlockStart;
use League\CommonMark\Parser\Block\BlockStartParserInterface;
use League\CommonMark\Parser\Cursor;
use League\CommonMark\Parser\MarkdownParserStateInterface;

class WarningParser implements BlockStartParserInterface
{
    public function tryStart(Cursor $cursor, MarkdownParserStateInterface $parserState): ?BlockStart
    {
        // Does the block start with {warning}?
        if (!str_starts_with($cursor->getRemainder(), '{warning}')) {
            return BlockStart::none();
        }

        // The block starts with {warning}, so remove only that prefix from the string.
        $warningMessage = trim(substr($cursor->getRemainder(), strlen('{warning}')));

        return BlockStart::of(new WarningBlockParser($warningMessage));
    }
}

Our WarningParser class will be used while looping through every block in the Markdown. It will check whether the block starts with {warning}. If it doesn't, we'll return null (via the BlockStart::none() method). If the block does start with {warning}, we'll remove it from the string to find our warning message. For example, if our Markdown was {warning} My warning here, then the warning message would be My warning here.
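The prefix check and message extraction can be tried out on their own, without the CommonMark package. The sketch below uses a hypothetical helper, extractWarningMessage, and a WARNING_PREFIX constant that are not part of the extension; they only mirror the logic described above so you can experiment with it in isolation.

```php
<?php

declare(strict_types=1);

const WARNING_PREFIX = '{warning}';

/**
 * Returns the warning message if the line starts with {warning},
 * or null if the line is not a warning block.
 */
function extractWarningMessage(string $line): ?string
{
    if (!str_starts_with($line, WARNING_PREFIX)) {
        return null;
    }

    // Drop the prefix, then trim the whitespace around the message.
    return trim(substr($line, strlen(WARNING_PREFIX)));
}

var_dump(extractWarningMessage('{warning} My warning here')); // string(15) "My warning here"
var_dump(extractWarningMessage('Just a paragraph'));          // NULL
```

Removing only the leading prefix, rather than replacing every occurrence in the line, means a message that happens to mention {warning} again is left intact.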

We then pass the warning message to a WarningBlockParser class, an instance of which is then passed to the BlockStart::of() method. Our WarningBlockParser class implements the BlockContinueParserInterface, so we have to implement several methods. Our WarningBlockParser will look like this:

namespace App\Markdown\Extensions;

use League\CommonMark\Node\Block\AbstractBlock;
use League\CommonMark\Parser\Block\BlockContinue;
use League\CommonMark\Parser\Block\BlockContinueParserInterface;
use League\CommonMark\Parser\Cursor;

class WarningBlockParser implements BlockContinueParserInterface
{
    private Warning $warning;

    public function __construct(string $warningMessage)
    {
        $this->warning = new Warning($warningMessage);
    }

    public function getBlock(): AbstractBlock
    {
        return $this->warning;
    }

    public function isContainer(): bool
    {
        return false;
    }

    public function canHaveLazyContinuationLines(): bool
    {
        return false;
    }

    public function canContain(AbstractBlock $childBlock): bool
    {
        return false;
    }

    public function tryContinue(Cursor $cursor, BlockContinueParserInterface $activeBlockParser): ?BlockContinue
    {
        return BlockContinue::none();
    }

    public function addLine(string $line): void
    {
        //
    }

    public function closeBlock(): void
    {
        //
    }
}

The important part of this class is that the getBlock method returns an instance of our Warning class, which extends the AbstractBlock class. Our Warning class will look like this:

namespace App\Markdown\Extensions;

use League\CommonMark\Node\Block\AbstractBlock;

class Warning extends AbstractBlock
{
    public function __construct(private string $warningMessage)
    {
        parent::__construct();
    }

    public function getHtml(): string
    {
        return '<div class="warning">'.$this->warningMessage.'</div>';
    }
}

As you can see, we're returning the HTML in the getHtml method. For the purpose of this example, the HTML only contains a single <div> with a class of warning, but you could change this to whatever you'd prefer.
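One thing to keep in mind is that the warning message is concatenated into the HTML as-is. If the Markdown could come from untrusted authors, you would probably want to escape the message first. The sketch below shows the idea in plain PHP; renderWarningHtml is a hypothetical standalone helper, not part of the extension, and the same htmlspecialchars call could be placed inside getHtml instead.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: builds the warning markup with the message escaped.
 */
function renderWarningHtml(string $warningMessage): string
{
    // Convert &, <, >, and both quote styles into entities.
    $safeMessage = htmlspecialchars($warningMessage, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

    return '<div class="warning">'.$safeMessage.'</div>';
}

echo renderWarningHtml('Never trust <b>user</b> input'), PHP_EOL;
// <div class="warning">Never trust &lt;b&gt;user&lt;/b&gt; input</div>
```

The trade-off is that authors can no longer use inline HTML inside a warning, which is usually acceptable for a documentation site.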

Now that we've created our parser and defined the HTML that should be returned, we need to create our renderer class:

namespace App\Markdown\Extensions;

use League\CommonMark\Node\Node;
use League\CommonMark\Renderer\ChildNodeRendererInterface;
use League\CommonMark\Renderer\NodeRendererInterface;

class WarningRenderer implements NodeRendererInterface
{
    /**
     * @param Warning $node
     *
     * {@inheritDoc}
     */
    public function render(Node $node, ChildNodeRendererInterface $childRenderer)
    {
        return $node->getHtml();
    }
}

The render method in the WarningRenderer class simply calls the getHtml method on our Warning class and returns the result. Hence, this renderer class will just return the HTML as a string.

Now that we've created our parser and renderer classes, we can add them to our WarningExtension extension class:

namespace App\Markdown\Extensions;

use League\CommonMark\Environment\EnvironmentBuilderInterface;
use League\CommonMark\Extension\ExtensionInterface;

class WarningExtension implements ExtensionInterface
{
    public function register(EnvironmentBuilderInterface $environment): void
    {
        // WarningParser is a block start parser, and the renderer is registered for the Warning node class.
        $environment->addBlockStartParser(new WarningParser())
            ->addRenderer(Warning::class, new WarningRenderer());
    }
}

Now that we've finalized our extension, we can register it in the environment:

use App\Markdown\Extensions\WarningExtension;
use League\CommonMark\Environment\Environment;
use League\CommonMark\Extension\CommonMark\CommonMarkCoreExtension;
use League\CommonMark\MarkdownConverter;

$environment = new Environment();

// Register the core CommonMark parsers and renderers...
$environment->addExtension(new CommonMarkCoreExtension());

// Register our new WarningExtension
$environment->addExtension(new WarningExtension());

$output = (new MarkdownConverter($environment))
    ->convert(file_get_contents(__DIR__.'/markdown/article.md'))
    ->getContent();

Let's say that the article.md file that we passed to the converter contained the following contents:

This is some text about a security-related issue.

{warning} This is the warning text

This is after the warning text.

This would result in the following HTML being output:

<p>This is some text about a security-related issue.</p>

<div class="warning">This is the warning text</div>

<p>This is after the warning text.</p>

Conclusion

Hopefully, this article has helped you understand what Markdown is and its benefits. It should also have given you an insight into how to securely use Markdown in your PHP projects to render HTML using CommonMark PHP, as well as how to use extensions to further enrich your Markdown.
