Open
Description
When a string containing HTML entities is passed in, the convert method decodes them and returns the decoded characters unchanged in the output.
This can lead to unintended behaviour if you have used htmlentities to encode text that contains legitimate HTML code, so that the code is treated as a plain string rather than as markup.
See this test:
use PHPUnit\Framework\TestCase;
use Soundasleep\Html2Text;

class Html2TextTest extends TestCase
{
    public function testKeepsHtmlEntities()
    {
        // The first script element is entity-encoded; the second is a real tag.
        $html = '<p>Test <b>bold</b> &lt;script&gt;alert("nope!")&lt;/script&gt; <script>alert(\'text\')</script> <i>italic</i> <u>underline</u> <s>strikethrough</s> <a href="http://example.com">link</a></p>';
        $expected = 'Test bold &lt;script&gt;alert("example!")&lt;/script&gt; italic underline strikethrough [link](http://example.com)';
        $this->assertEquals($expected, Html2Text::convert($html));
    }
}

Output:
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-'Test bold &lt;script&gt;alert("example!")&lt;/script&gt; italic underline strikethrough [link](http://example.com)'
+'Test bold <script>alert("nope!")</script> italic underline strikethrough [link](http://example.com)'
I believe this is unintentional behaviour, because we end up with raw HTML code - the exact opposite of the intention when calling convert.
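For anyone hitting the same thing, here is a minimal sketch of the mechanism using only PHP built-ins. The naiveToText function is a hypothetical stand-in for a converter that strips tags and then decodes entities. It is not the library's actual implementation, which I have not checked. The last step shows one possible caller-side mitigation, assuming the text is later embedded in an HTML page.

```php
<?php
// Entity-encoded markup, as htmlentities() would produce it:
// &lt;script&gt;alert(&quot;nope!&quot;)&lt;/script&gt;
$encoded = htmlentities('<script>alert("nope!")</script>');

// Hypothetical stand-in for an HTML-to-text converter (an assumption,
// not Html2Text itself): remove real tags, then decode entities.
function naiveToText(string $html): string
{
    return html_entity_decode(strip_tags($html), ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// The real <p> tag is stripped, but the encoded script element is decoded
// back into literal markup.
$text = naiveToText('<p>Test ' . $encoded . '</p>');

// Possible mitigation on the caller's side: escape again at output time
// if the text ends up inside an HTML document.
$safe = htmlspecialchars($text, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
```

Escaping at output time treats the converted text as untrusted plain text, which sidesteps the question of whether convert should keep the entities.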