Types
istring
: Case-insensitive string
A series of case-insensitive characters. Internally, upper-case
ASCII characters will be converted to lower-case.
text
: Text
A series of characters that may contain newlines. Text tends to
indicate human-oriented text, as opposed to a machine format.
itext
: Case-insensitive text
A series of case-insensitive characters that may contain newlines.
int
: Integer
An integer. You are alternatively permitted to pass a string of digits
instead, which will be cast to an integer using (int).
float
: Float
A floating point number. You are alternatively permitted to pass a numeric
string (as defined by is_numeric()), which will be cast to a float using
(float).
bool
: Boolean
A boolean. You are alternatively permitted to pass an integer 0 or 1
(other integers are not permitted) or a string "on", "true" or "1" for
true, and "off", "false" or "0" for false.
lookup
: Lookup array
An array whose values are true, e.g. array('key' => true, 'key2' => true).
You are alternatively permitted to pass an array list of the keys
array('key', 'key2') or a comma-separated string of keys "key, key2".
If you pass an array list of values, ensure that your values are strictly
numerically indexed: array('key1', 2 => 'key2') will not do what you
expect and emits a warning.
list
: Array list
An array which has consecutive integer indexes, e.g.
array('val1', 'val2'). You are alternatively permitted to pass a
comma-separated string of keys "val1, val2". If your array is not in
this form, array_values is run on the array and a warning is emitted.
hash
: Associative array
An array which is a mapping of keys to values, e.g.
array('key1' => 'val1', 'key2' => 'val2'). You are alternatively
permitted to pass a comma-separated string of key-colon-value strings,
e.g. "key1: val1, key2: val2".
mixed
: Mixed
An arbitrary PHP value of any type.
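The alternative input forms described above can be sketched as configuration calls. This is a minimal illustration rather than a recommended setup: the include path is an assumption about your installation, and the directive values (class names, pixel limit, colors) are made up for the example.

```php
<?php
require_once 'HTMLPurifier.auto.php'; // assumed include path

$config = HTMLPurifier_Config::createDefault();

// bool: the string "on" is accepted in place of true
$config->set('Attr.EnableID', 'on');

// int: a string of digits is cast with (int)
$config->set('HTML.MaxImgLength', '800');

// lookup: three documented ways of writing the same value
$config->set('Attr.ForbiddenClasses', array('ad' => true, 'popup' => true));
$config->set('Attr.ForbiddenClasses', array('ad', 'popup'));
$config->set('Attr.ForbiddenClasses', 'ad, popup');

// hash: comma-separated key-colon-value pairs
// (this replaces the default color table; shown for illustration only)
$config->set('Core.ColorKeywords', 'red: #FF0000, blue: #0000FF');
```

Native PHP types are usually the clearer choice; the string forms are mostly convenient when values come from a text-based settings file.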
Attr
Attr.AllowedClasses
Version added
|
4.0.0
|
Type
|
Lookup array (or null)
|
Default
|
NULL
|
Used in
|
-
HTMLPurifier/AttrDef/HTML/Class.php on line 33
|
List of allowed class values in the class attribute. By default,
this is null, which means all classes are allowed.
Attr.AllowedFrameTargets
Type
|
Lookup array
|
Default
|
array()
|
Used in
|
-
HTMLPurifier/AttrDef/HTML/FrameTarget.php on line
32
|
Lookup table of all allowed link frame targets. Some commonly used
link targets include _blank, _self, _parent and _top. Values should
be lowercase, as validation will be done in a case-sensitive manner
despite W3C's recommendation. XHTML 1.0 Strict does not permit the
target attribute, so this directive will have no effect in that
doctype. XHTML 1.1 does not enable the Target module by default;
you will have to enable it manually (see the module documentation
for more details).
Attr.AllowedRel
List of allowed forward document relationships in the rel
attribute. Common values may be nofollow or print. By default, this
is empty, meaning that no document relationships are allowed.
Attr.AllowedRev
List of allowed reverse document relationships in the rev
attribute. This attribute is a bit of an edge-case; if you don't
know what it is for, stay away.
Attr.ClassUseCDATA
If null, class will auto-detect the doctype and, if matching XHTML
1.1 or XHTML 2.0, will use the restrictive NMTOKENS specification
of class. Otherwise, it will use a relaxed CDATA definition. If
true, the relaxed CDATA definition is forced; if false, the
NMTOKENS definition is forced. To get the behavior of HTML Purifier
prior to 4.0.0, set this directive to false. Some rationale behind
the auto-detection: in previous versions of HTML Purifier, it was
assumed that the form of class was NMTOKENS, as specified by the
XHTML Modularization (representing XHTML 1.1 and XHTML 2.0). The
DTDs for HTML 4.01 and XHTML 1.0, however, specify class as CDATA.
HTML 5 effectively defines it as CDATA, but with the additional
constraint that each name should be unique (this is not explicitly
outlined in previous specifications).
Attr.DefaultImageAlt
Version added
|
3.2.0
|
Type
|
String (or null)
|
Default
|
NULL
|
Used in
|
-
HTMLPurifier/AttrTransform/ImgRequired.php on line
33
|
This is the content of the alt attribute of an image if the user had
not previously specified one. This applies to all images
without a valid alt attribute, as opposed to
%Attr.DefaultInvalidImageAlt,
which only applies to invalid images and takes precedence in that
case. Default behavior with null is to use the basename
of the src attribute for the alt.
Attr.DefaultInvalidImage
Type
|
String
|
Default
|
''
|
Used in
|
-
HTMLPurifier/AttrTransform/ImgRequired.php on line
27
|
This is the default image an img tag will be pointed to if it does
not have a valid src attribute. In future versions, we may allow
the image tag to be removed completely, but due to design issues,
this is not possible right now.
Attr.DefaultInvalidImageAlt
Type
|
String
|
Default
|
'Invalid image'
|
Used in
|
-
HTMLPurifier/AttrTransform/ImgRequired.php on line
40
|
This is the content of the alt attribute of an invalid image if the user
had not previously specified an alt attribute. It has no effect
when the image is valid but there was no alt attribute present.
Attr.DefaultTextDir
Type
|
String
|
Allowed values
|
"ltr", "rtl"
|
Default
|
'ltr'
|
Used in
|
-
HTMLPurifier/AttrTransform/BdoDir.php on line 22
|
Defines the default text direction (ltr or rtl) of the document
being parsed. This generally is the same as the value of the dir
attribute in HTML, or ltr if that is not specified.
Attr.EnableID
Version added
|
1.2.0
|
Type
|
Boolean
|
Default
|
false
|
Aliases
|
HTML.EnableAttrID
|
Used in
|
-
HTMLPurifier/AttrDef/HTML/ID.php on line 41
|
Allows the ID attribute in HTML. This is disabled by default because,
without proper configuration, user input can easily
break the validation of a webpage by specifying an ID that is
already in the surrounding HTML. If you don't mind throwing caution
to the wind, enable this directive, but I strongly recommend you
also consider blacklisting IDs you use (%Attr.IDBlacklist) or
prefixing all user-supplied IDs (%Attr.IDPrefix). When
set to true, HTML Purifier reverts to the behavior of pre-1.2.0
versions.
Attr.ForbiddenClasses
Version added
|
4.0.0
|
Type
|
Lookup array
|
Default
|
array()
|
Used in
|
-
HTMLPurifier/AttrDef/HTML/Class.php on line 34
|
List of forbidden class values in the class attribute. By default,
this is empty, which means that no classes are forbidden. See also
%Attr.AllowedClasses.
Attr.ID.HTML5
Version added
|
4.8.0
|
Type
|
Boolean (or null)
|
Default
|
NULL
|
Used in
|
-
HTMLPurifier/AttrDef/HTML/ID.php on line 75
|
In HTML5, restrictions on the format of the id attribute have been
significantly relaxed, such that any string is valid so long as it
contains no spaces and is at least one character. In lieu of a
general HTML5 compatibility flag, set this configuration directive
to true to use the relaxed rules.
Attr.IDBlacklist
Type
|
Array list
|
Default
|
array()
|
Used in
|
-
HTMLPurifier/IDAccumulator.php on line 27
|
Array of IDs not allowed in the document.
Attr.IDBlacklistRegexp
Version added
|
1.6.0
|
Type
|
String (or null)
|
Default
|
NULL
|
Used in
|
-
HTMLPurifier/AttrDef/HTML/ID.php on line 97
|
PCRE regular expression to be matched against all IDs. If the
expression matches, the ID is rejected. Use this with care: it may
cause significant performance degradation. ID matching is done after
all other validation.
Attr.IDPrefix
Version added
|
1.2.0
|
Type
|
String
|
Default
|
''
|
Used in
|
-
HTMLPurifier/AttrDef/HTML/ID.php on line 51
|
String to prefix to IDs. If you have no idea what IDs your pages
may use, you may opt to simply add a prefix to all user-submitted
ID attributes so that they are still usable, but will not conflict
with core page IDs. Example: setting the directive to 'user_' will
turn a user-submitted 'foo' into 'user_foo'. Be sure to
set %HTML.EnableAttrID to true
before using this.
Attr.IDPrefixLocal
Version added
|
1.2.0
|
Type
|
String
|
Default
|
''
|
Used in
|
-
HTMLPurifier/AttrDef/HTML/ID.php on lines 53, 58
|
Temporary prefix for IDs used in conjunction with
%Attr.IDPrefix. If you need to allow multiple
sets of user content on a web page, you may need to have a separate
prefix that changes with each iteration. This way, separately
submitted pieces of user content displayed on the same page don't
clobber each other. Ideal values are unique identifiers for the content
it represents (e.g. the id of the row in the database). Be sure to add
a separator (like an underscore) at the end. Warning: this
directive will not work unless
%Attr.IDPrefix is set to a non-empty value!
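The ID directives above are typically combined. The following is a minimal sketch under stated assumptions: the include path, the `$rowId` variable and the `comment` naming are hypothetical and only illustrate how a per-item local prefix might be built.

```php
<?php
require_once 'HTMLPurifier.auto.php'; // assumed include path

$rowId = 42; // hypothetical database row id of the content being shown

$config = HTMLPurifier_Config::createDefault();
$config->set('Attr.EnableID', true);            // IDs are off by default
$config->set('Attr.IDPrefix', 'user_');         // must be non-empty for the local prefix to apply
$config->set('Attr.IDPrefixLocal', 'comment' . $rowId . '_'); // ends with a separator

$purifier = new HTMLPurifier($config);
echo $purifier->purify('<p id="foo">Hello</p>');
```

With this configuration, user-supplied IDs should come out carrying both prefixes, so two comments that each use id="foo" no longer collide with each other or with the page's own IDs.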
Core
Core.AggressivelyFixLt
Version added
|
2.1.0
|
Type
|
Boolean
|
Default
|
true
|
Used in
|
-
HTMLPurifier/Lexer/DOMLex.php on line 54
|
This directive enables aggressive pre-filter fixes HTML Purifier
can perform in order to ensure that open angled-brackets do not
get killed during the parsing stage. Enabling this will result in two
preg_replace_callback calls and at least two preg_replace calls
for every HTML document parsed; if your users make very
well-formed HTML, you can set this directive to false. This has no
effect when DirectLex is used.
Notice: This directive's default turned from
false to true in HTML Purifier 3.2.0.
Core.AllowHostnameUnderscore
Version added
|
4.6.0
|
Type
|
Boolean
|
Default
|
false
|
Used in
|
-
HTMLPurifier/AttrDef/URI/Host.php on line 77
|
By RFC 1123, underscores are not permitted in host names. (This
is in contrast to the specification for DNS, RFC 2181, which
allows underscores.) However, most browsers do the right thing
when faced with an underscore in the host name, and so some
poorly written websites are written with the expectation this
should work. Setting this parameter to true relaxes our allowed
character check so that underscores are permitted.
Core.CollectErrors
Version added
|
2.0.0
|
Type
|
Boolean
|
Default
|
false
|
Used in
|
-
HTMLPurifier.php on line 162
-
HTMLPurifier/Lexer.php on lines 85, 315
-
HTMLPurifier/Lexer/DirectLex.php on lines 67, 87,
385
-
HTMLPurifier/Strategy/RemoveForeignElements.php on
line 57
|
Whether or not to collect errors found while filtering the
document. This is a useful way to give feedback to your users.
Warning: Currently this feature is very patchy and
experimental, with lots of possible error messages not yet
implemented. It will not cause any problems, but it may not help
your users either.
Core.ColorKeywords
Version added
|
2.0.0
|
Type
|
Associative array
|
Default
|
array (
'maroon' => '#800000',
'red' => '#FF0000',
'orange' => '#FFA500',
'yellow' => '#FFFF00',
'olive' => '#808000',
'purple' => '#800080',
'fuchsia' => '#FF00FF',
'white' => '#FFFFFF',
'lime' => '#00FF00',
'green' => '#008000',
'navy' => '#000080',
'blue' => '#0000FF',
'aqua' => '#00FFFF',
'teal' => '#008080',
'black' => '#000000',
'silver' => '#C0C0C0',
'gray' => '#808080',
)
|
Used in
|
-
HTMLPurifier/AttrDef/CSS/Color.php on line 19
-
HTMLPurifier/AttrDef/HTML/Color.php on line 19
|
Lookup array mapping color names to the corresponding six-digit
hexadecimal color value, with a preceding hash mark. Used when parsing
colors. The lookup is done in a case-insensitive manner.
Core.ConvertDocumentToFragment
Type
|
Boolean
|
Default
|
true
|
Aliases
|
Core.AcceptFullDocuments
|
Used in
|
-
HTMLPurifier/Lexer.php on line 313
|
This parameter determines whether or not the filter should convert
input that is a full document with html and body tags to a fragment
of just the contents of a body tag. This parameter is simply
something HTML Purifier can do during an edge-case: for most
inputs, this processing is not necessary.
Core.DirectLexLineNumberSyncInterval
Version added
|
2.0.0
|
Type
|
Integer
|
Default
|
0
|
Used in
|
-
HTMLPurifier/Lexer/DirectLex.php on line 84
|
Specifies the number of tokens the DirectLex line number tracking
implementations should process before attempting to resynchronize
the current line count by manually counting all previous
new-lines. When at 0, this functionality is disabled. Lower
values will decrease performance, and this is only strictly
necessary if the counting algorithm is buggy (in which case you
should report it as a bug). This has no effect when %Core.MaintainLineNumbers is
disabled or DirectLex is not being used.
Core.DisableExcludes
Version added
|
4.5.0
|
Type
|
Boolean
|
Default
|
false
|
Used in
|
-
HTMLPurifier/Strategy/FixNesting.php on line 54
|
This directive disables SGML-style exclusions, e.g. the exclusion
of <object> in any descendant of a <pre> tag. Disabling excludes
will allow some
invalid documents to pass through HTML Purifier, but HTML
Purifier will also be less likely to accidentally remove large
documents during processing.
Core.EnableIDNA
Version added
|
4.4.0
|
Type
|
Boolean
|
Default
|
false
|
Used in
|
-
HTMLPurifier/AttrDef/URI/Host.php on line 105
|
Allows international domain names in URLs. This configuration
option requires the PEAR Net_IDNA2 module to be installed. It
operates by punycoding any internationalized host names for maximum
portability.
Core.Encoding
If for some reason you are unable to convert all webpages to UTF-8,
you can use this directive as a stop-gap compatibility change to
let HTML Purifier deal with non-UTF-8 input. This technique has
notable deficiencies: absolutely no characters outside of the
selected character encoding will be preserved, not even the ones
that have been ampersand escaped (this is due to a UTF-8 specific
feature that automatically resolves all entities), making
it pretty useless for anything except the most I18N-blind
applications, although
%Core.EscapeNonASCIICharacters
fixes this trouble with another tradeoff. This directive
only accepts ISO-8859-1 if iconv is not enabled.
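As a sketch of the stop-gap described above, the two encoding directives can be set together. The include path and the sample input are assumptions for illustration; the tradeoff of the second directive is that every non-ASCII character is written as a numeric entity.

```php
<?php
require_once 'HTMLPurifier.auto.php'; // assumed include path

$config = HTMLPurifier_Config::createDefault();
$config->set('Core.Encoding', 'ISO-8859-1');        // input and output are not UTF-8
$config->set('Core.EscapeNonASCIICharacters', true); // emit non-ASCII characters as numeric entities

$purifier = new HTMLPurifier($config);

// Hypothetical ISO-8859-1 input: "caf" followed by byte 0xE9 (e with acute accent)
$clean = $purifier->purify("<p>caf\xE9</p>");
```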
Core.EscapeInvalidChildren
Warning: this configuration option no longer
does anything as of 4.6.0.
When true, a child that is found not to be allowed in the context of
the parent element will be transformed into text as if it were
ASCII. When false, that element and all internal tags will be
dropped, though text will be preserved. There is no option for
dropping the element but preserving child nodes.
Core.EscapeInvalidTags
Type
|
Boolean
|
Default
|
false
|
Used in
|
-
HTMLPurifier/Strategy/MakeWellFormed.php on line
72
-
HTMLPurifier/Strategy/RemoveForeignElements.php on
line 26
|
When true, invalid tags will be written back to the document as
plain text. Otherwise, they are silently dropped.
Core.EscapeNonASCIICharacters
Version added
|
1.4.0
|
Type
|
Boolean
|
Default
|
false
|
Used in
|
-
HTMLPurifier/Encoder.php on line 423
|
This directive overcomes a deficiency in
%Core.Encoding by blindly converting all
non-ASCII characters into decimal numeric entities before
converting it to its native encoding. This means that even
characters that can be expressed in the non-UTF-8 encoding will be
entity-ized, which can be a real downer for encodings like Big5. It
also assumes that the ASCII repertoire is available, although this
is the case for almost all encodings. Anyway, use UTF-8!
Core.HiddenElements
Type
|
Lookup array
|
Default
|
array (
'script' => true,
'style' => true,
)
|
Used in
|
-
HTMLPurifier/Strategy/RemoveForeignElements.php on
line 36
|
This directive is a lookup array of elements which should have
their contents removed when they are not allowed by the HTML
definition. For example, the contents of a script
tag are not normally shown in a document, so if script tags are
to be removed, their contents should be removed too. This is
opposed to a b tag, which defines some
presentational changes but does not hide its contents.
Core.Language
Version added
|
2.0.0
|
Type
|
String
|
Default
|
'en'
|
Used in
|
-
HTMLPurifier/LanguageFactory.php on line 93
|
ISO 639 language code for localizable things in HTML Purifier to
use, which is mainly error reporting. There is currently only an
English (en) translation, so this directive is currently useless.
Core.LexerImpl
Version added
|
2.0.0
|
Type
|
Mixed (or null)
|
Default
|
NULL
|
Used in
|
-
HTMLPurifier/Lexer.php on line 80
|
This parameter determines what lexer implementation can be used.
The valid values are:
null
: Recommended, the lexer implementation will be auto-detected
based on your PHP-version and configuration.
string lexer identifier
: This is a slim way of manually overriding the implementation.
Currently recognized values are: DOMLex (the default PHP5
implementation) and DirectLex (the default PHP4
implementation). Only use this if you know what you are doing:
usually, the auto-detection will manage things for cases you
aren't even aware of.
object lexer instance
: Super-advanced: you can specify your own, custom,
implementation that implements the interface defined by
HTMLPurifier_Lexer. I may remove this option
simply because I don't expect anyone to use it.
Core.MaintainLineNumbers
Version added
|
2.0.0
|
Type
|
Boolean (or null)
|
Default
|
NULL
|
Used in
|
-
HTMLPurifier/Lexer.php on line 84
-
HTMLPurifier/Lexer/DirectLex.php on line 62
|
If true, HTML Purifier will add line number information to all
tokens. This is useful when error reporting is turned on, but can
result in significant performance degradation and should not be
used when unnecessary. This directive must be used with the
DirectLex lexer, as the DOMLex lexer does not (yet) support this
functionality. If the value is null, an appropriate value will be
selected based on other configuration.
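Because line numbers are only available with DirectLex, the lexer and line-number directives are usually set together when experimenting with error collection. A minimal sketch, assuming the include path shown and a made-up input string:

```php
<?php
require_once 'HTMLPurifier.auto.php'; // assumed include path

$config = HTMLPurifier_Config::createDefault();
$config->set('Core.LexerImpl', 'DirectLex');    // the lexer that can track line numbers
$config->set('Core.MaintainLineNumbers', true); // attach line numbers to tokens
$config->set('Core.CollectErrors', true);       // experimental, see the warning on that directive

$purifier = new HTMLPurifier($config);
$clean = $purifier->purify("<p>one\n<b>two</p>");
```

Leaving %Core.LexerImpl and %Core.MaintainLineNumbers at null lets HTML Purifier pick suitable values itself, which is the safer default outside of debugging.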
Core.NormalizeNewlines
Version added
|
4.2.0
|
Type
|
Boolean
|
Default
|
true
|
Used in
|
-
HTMLPurifier/Generator.php on line 122
-
HTMLPurifier/Lexer.php on line 297
|
Whether or not to normalize newlines to the operating system
default. When false, HTML Purifier will attempt to
preserve mixed newline files.
Core.RemoveInvalidImg
Version added
|
1.3.0
|
Type
|
Boolean
|
Default
|
true
|
Used in
|
-
HTMLPurifier/AttrTransform/ImgRequired.php on line
24
-
HTMLPurifier/Strategy/RemoveForeignElements.php on
line 27
|
This directive enables pre-emptive URI checking in img
tags, as the attribute validation strategy is
not authorized to remove elements from the document. Revert to
pre-1.3.0 behavior by setting to false.
Core.RemoveProcessingInstructions
Version added
|
4.2.0
|
Type
|
Boolean
|
Default
|
false
|
Used in
|
-
HTMLPurifier/Lexer.php on line 334
|
Instead of escaping processing instructions in the form <? ... ?>,
remove them outright. This may be useful if the
HTML you are validating contains XML processing instruction gunk;
however, it can also be user-unfriendly for people attempting to
post PHP snippets.
Core.RemoveScriptContents
Version added
|
2.0.0
|
Type
|
Boolean (or null)
|
Default
|
NULL
|
Used in
|
-
HTMLPurifier/Strategy/RemoveForeignElements.php on
line 35
|
Warning: This directive was deprecated in version 2.1.0.
%Core.HiddenElements should be used instead.
This directive enables HTML Purifier to remove not only script
tags but all of their contents.
HTML
HTML.Allowed
This is a preferred convenience directive that combines %HTML.AllowedElements and %HTML.AllowedAttributes. Specify
elements and attributes that are allowed using the form
element1[attr1|attr2],element2...
For example, if
you would like to only allow paragraphs and links, specify
a[href],p. You can specify attributes that apply to
all elements using an asterisk, e.g. *[lang]. You
can also use newlines instead of commas to separate elements.
Warning: All of the constraints on the component
directives are still enforced. The syntax is a subset of
TinyMCE's valid_elements whitelist: directly
copy-pasting it here will probably result in broken whitelists.
If %HTML.AllowedElements or
%HTML.AllowedAttributes are
set, this directive has no effect.
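The paragraphs-and-links whitelist mentioned above might be applied as follows. This is a sketch: the include path and the sample markup are assumptions for the example.

```php
<?php
require_once 'HTMLPurifier.auto.php'; // assumed include path

$config = HTMLPurifier_Config::createDefault();

// Only <p> and <a href="..."> are whitelisted; everything else is outside the list.
$config->set('HTML.Allowed', 'a[href],p');

$purifier = new HTMLPurifier($config);
echo $purifier->purify(
    '<p>See <a href="http://example.com/" title="x">this</a> <b>link</b></p>'
);
```

Since neither the b element nor the title attribute appears in the whitelist, both should be absent from the output while the paragraph, the link and its href remain.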
HTML.AllowedAttributes
Version added
|
1.3.0
|
Type
|
Lookup array (or null)
|
Default
|
NULL
|
Used in
|
-
HTMLPurifier/HTMLDefinition.php on line 292
|
If HTML Purifier's attribute set is unsatisfactory, overload it!
The syntax is "tag.attr" or "*.attr" for the global attributes
(style, id, class, dir, lang, xml:lang).
Warning: If another directive conflicts with the
elements here, that directive will win and override. For
example, %HTML.EnableAttrID will
take precedence over *.id in this directive. You must set that
directive to true before you can use IDs at all.
HTML.AllowedComments
Version added
|
4.4.0
|
Type
|
Lookup array
|
Default
|
array()
|
Used in
|
-
HTMLPurifier/Strategy/RemoveForeignElements.php on
line 31
|
A whitelist which indicates what explicit comment bodies should be
allowed, modulo leading and trailing whitespace. See also
%HTML.AllowedCommentsRegexp
(these directives are union'ed together, so a comment is considered
valid if any directive deems it valid.)
HTML.AllowedCommentsRegexp
Version added
|
4.4.0
|
Type
|
String (or null)
|
Default
|
NULL
|
Used in
|
-
HTMLPurifier/Strategy/RemoveForeignElements.php on
line 32
|
A regexp, which if it matches the body of a comment, indicates that
it should be allowed. Trailing and leading spaces are removed prior
to running this regular expression.
Warning: Make sure you specify correct anchor metacharacters
^regex$, otherwise you may accept comments that you
did not mean to! In particular, the regex /foo|bar/ is
probably not sufficiently strict, since it also allows
foobar. See also
%HTML.AllowedComments (these directives
are union'ed together, so a comment is considered valid if any
directive deems it valid.)
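The anchoring pitfall can be checked with plain preg_match, independent of HTML Purifier. In this sketch the variable names and sample comment bodies are made up. Note that the alternation needs to be grouped as well as anchored: ^foo|bar$ alone would still accept foobar.

```php
<?php
// Unanchored alternation: matches if "foo" or "bar" occurs anywhere in the body.
$loose = '/foo|bar/';

// Anchored and grouped: the whole (already trimmed) body must be "foo" or "bar".
$strict = '/^(foo|bar)$/';

foreach (array('foo', 'bar', 'foobar') as $body) {
    printf(
        "%s: loose=%d strict=%d\n",
        $body,
        preg_match($loose, $body),
        preg_match($strict, $body)
    );
}
```

Only the strict pattern rejects foobar, which is the behavior the warning above asks for.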
HTML.AllowedElements
Version added
|
1.3.0
|
Type
|
Lookup array (or null)
|
Default
|
NULL
|
Used in
|
-
HTMLPurifier/HTMLDefinition.php on line 291
|
If HTML Purifier's tag set is unsatisfactory for your needs, you
can overload it with your own list of tags to allow. If you
change this, you probably also want to change %HTML.AllowedAttributes; see also
%HTML.Allowed which lets you set
allowed elements and attributes at the same time.
If you attempt to allow an element that HTML Purifier does not
know about, HTML Purifier will raise an error. You will need to
manually tell HTML Purifier about this element by using the
advanced customization features.
Warning: If another directive conflicts with the
elements here, that directive will win and override.
HTML.AllowedModules
Version added
|
2.0.0
|
Type
|
Lookup array (or null)
|
Default
|
NULL
|
Used in
|
-
HTMLPurifier/HTMLModuleManager.php on line 241
|
A doctype comes with a set of usual modules to use. Without
having to muck about with the doctypes, you can quickly
activate or disable these modules by specifying which modules you
wish to allow with this directive. This is most useful for unit
testing specific modules, although end users may find it useful
for their own ends.
If you specify a module that does not exist, the manager will
silently fail to use it, so be careful! User-defined modules are
not affected by this directive. Modules defined in %HTML.CoreModules are not affected by
this directive.
HTML.Attr.Name.UseCDATA
Version added
|
4.0.0
|
Type
|
Boolean
|
Default
|
false
|
Used in
|
-
HTMLPurifier/AttrTransform/Name.php on line 18
-
HTMLPurifier/HTMLModule/Name.php on line 19
|
The W3C specification DTD defines the name attribute to be CDATA,
not ID, due to limitations of DTD. In certain documents, this
relaxed behavior is desired, whether it is to specify duplicate
names, or to specify names that would be illegal IDs (for example,
names that begin with a digit.) Set this configuration directive to
true to use the relaxed parsing rules.
HTML.BlockWrapper
Version added
|
1.3.0
|
Type
|
String
|
Default
|
'p'
|
Used in
|
-
HTMLPurifier/HTMLDefinition.php on line 263
|
String name of element to wrap inline elements that are inside a
block context. This only occurs in the children of blockquote in
strict mode.
Example: with the default value,
<blockquote>Foo</blockquote> would become
<blockquote><p>Foo</p></blockquote>.
The <p> tags can be replaced with whatever you
desire, as long as it is a block level element.
HTML.CoreModules
Version added
|
2.0.0
|
Type
|
Lookup array
|
Default
|
array (
'Structure' => true,
'Text' => true,
'Hypertext' => true,
'List' => true,
'NonXMLCommonAttributes' => true,
'XMLCommonAttributes' => true,
'CommonAttributes' => true,
)
|
Used in
|
-
HTMLPurifier/HTMLModuleManager.php on line 242
|
Certain modularized doctypes (namely XHTML) have certain
modules that must be included for the doctype to be a conforming
document type: put those modules here. By default, XHTML's core
modules are used. You can set this to a blank array to disable
core module protection, but this is not recommended.
HTML.CustomDoctype
Version added
|
2.0.1
|
Type
|
String (or null)
|
Default
|
NULL
|
Used in
|
-
HTMLPurifier/DoctypeRegistry.php on line 123
|
A custom doctype for power-users who defined their own document
type. This directive only applies when
%HTML.Doctype is blank.
HTML.DefinitionID
Unique identifier for a custom-built HTML definition. If you edit
the raw version of the HTMLDefinition, introducing changes that
the configuration object does not reflect, you must specify this
variable. If you change your custom edits, you should change this
directive, or clear your cache. Example:
$config = HTMLPurifier_Config::createDefault();
$config->set('HTML', 'DefinitionID', '1');
$def = $config->getHTMLDefinition();
$def->addAttribute('a', 'tabindex', 'Number');
In the above example, the configuration is still at the defaults,
but using the advanced API, an extra attribute has been added.
The configuration object normally has no way of knowing that this
change has taken place, so it needs an extra directive: %HTML.DefinitionID. If someone else
attempts to use the default configuration, these two pieces of
code will not clobber each other in the cache, since one has an
extra directive attached to it.
You must specify a value to this directive to use the
advanced API features.
HTML.DefinitionRev
Version added
|
2.0.0
|
Type
|
Integer
|
Default
|
1
|
Revision identifier for your custom definition specified in
%HTML.DefinitionID. This serves
the same purpose: uniquely identifying your custom definition,
but this one does so in a chronological context: revision 3 is
more up-to-date than revision 2. Thus, when this gets
incremented, the cache handling is smart enough to clean up any
older revisions of your definition as well as flush the cache.
HTML.Doctype
Type
|
String (or null)
|
Allowed values
|
"HTML 4.01 Transitional", "HTML 4.01 Strict", "XHTML 1.0
Transitional", "XHTML 1.0 Strict", "XHTML 1.1"
|
Default
|
NULL
|
Used in
|
-
HTMLPurifier/DoctypeRegistry.php on line 119
|
Doctype to use during filtering. Technically speaking this is not
actually a doctype (as it does not identify a corresponding DTD),
but we are using this name for sake of simplicity. When non-blank,
this will override any older directives like
%HTML.XHTML or
%HTML.Strict.
HTML.FlashAllowFullScreen
Version added
|
4.2.0
|
Type
|
Boolean
|
Default
|
false
|
Used in
|
-
HTMLPurifier/AttrTransform/SafeParam.php on line
53
|
Whether or not to permit embedded Flash content from %HTML.SafeObject to expand to the full
screen. Corresponds to the allowFullScreen parameter.
HTML.ForbiddenAttributes
Version added
|
3.1.0
|
Type
|
Lookup array
|
Default
|
array()
|
Used in
|
-
HTMLPurifier/HTMLDefinition.php on line 400
|
While this directive is similar to %HTML.AllowedAttributes, for
forwards-compatibility with XML, this directive has a different
syntax. Instead of tag.attr, use tag@attr. To disallow href
attributes in a tags, set this directive to a@href. You can also
disallow an attribute globally with attr or *@attr (either syntax is
fine; the latter is provided for consistency with %HTML.AllowedAttributes).
Warning: This directive complements %HTML.ForbiddenElements;
accordingly, check out that directive for a discussion of why you
should think twice before using this directive.
HTML.ForbiddenElements
Version added
|
3.1.0
|
Type
|
Lookup array
|
Default
|
array()
|
Used in
|
-
HTMLPurifier/HTMLDefinition.php on line 399
|
This was, perhaps, the most requested feature ever in HTML
Purifier. Please don't abuse it! This is the logical inverse of
%HTML.AllowedElements, and it
will override that directive, or any other directive.
If possible, %HTML.Allowed is
recommended over this directive, because it can sometimes be
difficult to tell whether or not you've forbidden all of the
behavior you would like to disallow. If you forbid img
with the expectation of preventing images on
your site, you'll be in for a nasty surprise when people start
using the background-image CSS property.
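The two blacklist directives use different syntaxes, which the following sketch puts side by side. The include path is an assumption, and forbidding the style attribute globally is just one hypothetical way to address the background-image caveat above; a whitelist via %HTML.Allowed remains the recommended approach.

```php
<?php
require_once 'HTMLPurifier.auto.php'; // assumed include path

$config = HTMLPurifier_Config::createDefault();

// Elements are listed by name.
$config->set('HTML.ForbiddenElements', array('img'));

// Attributes use tag@attr; *@attr forbids the attribute on every element.
$config->set('HTML.ForbiddenAttributes', array('a@href', '*@style'));

$purifier = new HTMLPurifier($config);
$clean = $purifier->purify('<p style="color:red">Hi <img src="x.png" alt="x" /></p>');
```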
HTML.MaxImgLength
Version added
|
3.1.1
|
Type
|
Integer (or null)
|
Default
|
1200
|
Used in
|
-
HTMLPurifier/HTMLModule/Image.php on line 21
-
HTMLPurifier/HTMLModule/SafeEmbed.php on line 18
-
HTMLPurifier/HTMLModule/SafeObject.php on line 24
|
This directive controls the maximum number of pixels in the width
and height attributes in img tags. This is in place
to prevent imagecrash attacks; disable with null at your own
risk. This directive is similar to %CSS.MaxImgLength, and both should be
concurrently edited, although there are subtle differences in the
input format (the HTML max is an integer).
HTML.Nofollow
Version added
|
4.3.0
|
Type
|
Boolean
|
Default
|
false
|
Used in
|
-
HTMLPurifier/HTMLModuleManager.php on line 268
|
If enabled, nofollow rel attributes are added to all outgoing
links.
HTML.Parent
Version added
|
1.3.0
|
Type
|
String
|
Default
|
'div'
|
Used in
|
-
HTMLPurifier/HTMLDefinition.php on line 273
|
String name of element that HTML fragment passed to library will
be inserted in. An interesting variation would be using span as
the parent element, meaning that only inline tags would be
allowed.
HTML.Proprietary
Version added
|
3.1.0
|
Type
|
Boolean
|
Default
|
false
|
Used in
|
-
HTMLPurifier/HTMLModuleManager.php on line 256
|
Whether or not to allow proprietary elements and attributes in
your documents, as per HTMLPurifier_HTMLModule_Proprietary.
Warning: This can cause your documents to stop
validating!
HTML.SafeEmbed
Version added
|
3.1.1
|
Type
|
Boolean
|
Default
|
false
|
Used in
|
-
HTMLPurifier/HTMLModuleManager.php on line 262
|
Whether or not to permit embed tags in documents, with a number
of extra security features added to prevent script execution.
This is similar to what websites like MySpace do to embed tags.
Embed is a proprietary element and will cause your website to
stop validating; you should see if you can use %Output.FlashCompat with %HTML.SafeObject instead first.
HTML.SafeIframe
Version added
|
4.4.0
|
Type
|
Boolean
|
Default
|
false
|
Used in
|
-
HTMLPurifier/HTMLModule/Iframe.php on line 28
-
HTMLPurifier/URIFilter/SafeIframe.php on line 48
|
Whether or not to permit iframe tags in untrusted documents. This
directive must be accompanied by a whitelist of permitted
iframes, such as %URI.SafeIframeRegexp, otherwise it
will fatally error. This directive has no effect on strict
doctypes, as iframes are not valid.
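Enabling iframes therefore takes two directives. A minimal sketch follows; the include path and the whitelisted host are assumptions chosen for illustration, and the regular expression should be adapted and anchored to the providers you actually trust.

```php
<?php
require_once 'HTMLPurifier.auto.php'; // assumed include path

$config = HTMLPurifier_Config::createDefault();
$config->set('HTML.SafeIframe', true);

// Whitelist of permitted iframe sources (hypothetical choice of host).
$config->set('URI.SafeIframeRegexp', '%^https://www\.youtube\.com/embed/%');

$purifier = new HTMLPurifier($config);
$clean = $purifier->purify(
    '<iframe src="https://www.youtube.com/embed/abc123" width="560" height="315"></iframe>'
);
```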
HTML.SafeObject
Version added
|
3.1.1
|
Type
|
Boolean
|
Default
|
false
|
Used in
|
-
HTMLPurifier/HTMLModuleManager.php on line 259
|
Whether or not to permit object tags in documents, with a number
of extra security features added to prevent script execution.
This is similar to what websites like MySpace do to object tags.
You should also enable %Output.FlashCompat in order to
generate Internet Explorer compatibility code for your object
tags.
HTML.SafeScripting
Version added
|
4.5.0
|
Type
|
Lookup array
|
Default
|
array()
|
Used in
|
-
HTMLPurifier/HTMLModuleManager.php on line 265
-
HTMLPurifier/HTMLModule/SafeScripting.php on line
22
|
Whether or not to permit script tags to external scripts in
documents. Inline scripting is not allowed, and the script must
match an explicit whitelist.
HTML.Strict
Version added
|
1.3.0
|
Type
|
Boolean
|
Default
|
false
|
Used in
|
-
HTMLPurifier/DoctypeRegistry.php on line 133
|
Warning: This directive was deprecated in version 1.7.0.
%HTML.Doctype should be used instead.
Determines whether or not to use Transitional (loose) or Strict
rulesets.
HTML.TargetBlank
Version added
|
4.4.0
|
Type
|
Boolean
|
Default
|
false
|
Used in
|
-
HTMLPurifier/HTMLModuleManager.php on line 271
|
If enabled, target=blank
attributes are added to all
outgoing links. (This includes links from an HTTPS version of a
page to an HTTP version.)
HTML.TargetNoreferrer
Version added
|
4.8.0
|
Type
|
Boolean
|
Default
|
true
|
Used in
|
-
HTMLPurifier/HTMLModuleManager.php on line 276
|
If enabled, noreferrer rel attributes are added to links which have
a target attribute associated with them. This prevents malicious
destinations from overwriting the original window.
HTML.TidyAdd
Version added
|
2.0.0
|
Type
|
Lookup array
|
Default
|
array()
|
Used in
|
-
HTMLPurifier/HTMLModule/Tidy.php on line 54
|
Fixes to add to the default set of Tidy fixes as per your level.
HTML.TidyLevel
Version added
|
2.0.0
|
Type
|
String
|
Allowed values
|
"none", "light", "medium", "heavy"
|
Default
|
'medium'
|
Used in
|
-
HTMLPurifier/HTMLModule/Tidy.php on line 50
|
General level of cleanliness the Tidy module should enforce.
There are four allowed values:
- none: No extra tidying should be done
- light: Only fix elements that would be discarded otherwise due to lack
of support in doctype
- medium: Enforce best practices
- heavy: Transform all deprecated elements and attributes to standards
compliant equivalents
HTML.TidyRemove
Version added
|
2.0.0
|
Type
|
Lookup array
|
Default
|
array()
|
Used in
|
-
HTMLPurifier/HTMLModule/Tidy.php on line 55
|
Fixes to remove from the default set of Tidy fixes as per your
level.
HTML.Trusted
Version added
|
2.0.0
|
Type
|
Boolean
|
Default
|
false
|
Used in
|
-
HTMLPurifier/HTMLModuleManager.php on line 234
-
HTMLPurifier/Lexer.php on line 302
-
HTMLPurifier/HTMLModule/Image.php on line 37
-
HTMLPurifier/Lexer/DirectLex.php on line 47
-
HTMLPurifier/Strategy/RemoveForeignElements.php on
line 30
|
Indicates whether the user input is trusted. If the
input is trusted, a more expansive set of allowed tags and
attributes will be used. See also %CSS.Trusted.
HTML.XHTML
Version added
|
1.1.0
|
Type
|
Boolean
|
Default
|
true
|
Aliases
|
Core.XHTML
|
Used in
|
-
HTMLPurifier/DoctypeRegistry.php on line 128
|
Warning: This directive was deprecated in version 1.7.0.
%HTML.Doctype should be used instead.
Determines whether output is XHTML 1.0 or HTML 4.01 flavor.
URI
URI.AllowedSchemes
Type
|
Lookup array
|
Default
|
array (
'http' => true,
'https' => true,
'mailto' => true,
'ftp' => true,
'nntp' => true,
'news' => true,
'tel' => true,
)
|
Used in
|
-
HTMLPurifier/URISchemeRegistry.php on line 48
|
Whitelist that defines the schemes that a URI is allowed to have.
This prevents XSS attacks from using pseudo-schemes like javascript
or mocha. There is also support for the data and file URI schemes,
but they are not enabled by default.
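As a hedged sketch of enabling one of the optional schemes, the snippet below replaces the whitelist with http, https, and data. It assumes HTML Purifier is installed and that `HTMLPurifier.auto.php` is on the include path (an assumption; adjust as needed). The choice of schemes is purely illustrative.

```php
<?php
// Assumed include path; adjust to wherever HTML Purifier lives in your project.
require_once 'HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();

// Lookup array: keys are scheme names, values are true.
$config->set('URI.AllowedSchemes', array(
    'http'  => true,
    'https' => true,
    'data'  => true, // supported, but not enabled by default
));

$purifier = new HTMLPurifier($config);
```

Setting the directive replaces the default list, so any default scheme you still need (mailto, ftp, and so on) must be listed again.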
URI.Base
Version added
|
2.1.0
|
Type
|
String (or null)
|
Default
|
NULL
|
Used in
|
-
HTMLPurifier/URIDefinition.php on line 77
|
The base URI is the URI of the document this purified HTML will
be inserted into. This information is important if HTML Purifier
needs to calculate absolute URIs from relative URIs, such as when
%URI.MakeAbsolute is on. You may
use a non-absolute URI for this value, but behavior may vary
(%URI.MakeAbsolute deals nicely
with both absolute and relative paths, but forwards-compatibility
is not guaranteed). Warning: If set, the scheme
on this URI overrides the one specified by %URI.DefaultScheme.
URI.DefaultScheme
Type
|
String (or null)
|
Default
|
'http'
|
Used in
|
-
HTMLPurifier/URIDefinition.php on line 84
|
Defines through what scheme the output will be served, in order
to select the proper object validator when no scheme information
is present.
Starting with HTML Purifier 4.9.0, the default scheme can be
null, in which case we reject all URIs which do not have explicit
schemes.
URI.DefinitionID
Unique identifier for a custom-built URI definition. If you want
to add custom URIFilters, you must specify this value.
URI.DefinitionRev
Version added
|
2.1.0
|
Type
|
Integer
|
Default
|
1
|
URI.Disable
Version added
|
1.3.0
|
Type
|
Boolean
|
Default
|
false
|
Aliases
|
Attr.DisableURI
|
Used in
|
-
HTMLPurifier/AttrDef/URI.php on line 47
|
Disables all URIs in all forms. Not sure why you'd want to do
that (after all, the Internet's founded on the notion of a
hyperlink).
URI.DisableExternal
Version added
|
1.2.0
|
Type
|
Boolean
|
Default
|
false
|
Disables links to external websites. This is a highly effective
anti-spam and anti-pagerank-leech measure, but comes at a hefty
price: no links or images outside of your domain will be allowed.
Non-linkified URIs will still be preserved. If you want to be able
to link to subdomains or use absolute URIs, specify
%URI.Host for your website.
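A short configuration sketch combining the two directives, assuming HTML Purifier is installed and `HTMLPurifier.auto.php` is on the include path (an assumption). The host `example.com` is a placeholder.

```php
<?php
// Assumed include path; adjust to wherever HTML Purifier lives in your project.
require_once 'HTMLPurifier.auto.php';

$config = HTMLPurifier_Config::createDefault();

// Drop links and images that point outside your own site...
$config->set('URI.DisableExternal', true);
// ...while telling HTML Purifier which host counts as "your own site".
$config->set('URI.Host', 'example.com'); // placeholder host

$purifier = new HTMLPurifier($config);
```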
URI.DisableExternalResources
Version added
|
1.3.0
|
Type
|
Boolean
|
Default
|
false
|
Disables the embedding of external resources, preventing users from
embedding things like images from other hosts. This prevents access
tracking (good for email viewers), bandwidth leeching, cross-site
request forging, goatse.cx posting, and other nasties, but also
results in a loss of end-user functionality (they can't directly
post a pic they posted from Flickr anymore). Use it if you don't
have a robust user-content moderation team.
URI.DisableResources
Version added
|
4.2.0
|
Type
|
Boolean
|
Default
|
false
|
Disables embedding resources, essentially meaning no pictures.
You can still link to them though. See %URI.DisableExternalResources
for why this might be a good idea.
Note: While this directive has been available since
1.3.0, it didn't actually start doing anything until 4.2.0.
URI.Host
Version added
|
1.2.0
|
Type
|
String (or null)
|
Default
|
NULL
|
Used in
|
-
HTMLPurifier/URIDefinition.php on line 76
-
HTMLPurifier/URIScheme.php on line 89
|
Defines the domain name of the server, so we can determine
whether or not an absolute URI is from your website. Not
strictly necessary, as users should be using relative URIs to
reference resources on your website. It will, however, let you
use absolute URIs to link to subdomains of the domain you post
here: i.e. example.com will allow sub.example.com. However,
higher up domains will still be excluded: if you set %URI.Host to sub.example.com, example.com will be
blocked. Note: This directive overrides %URI.Base because a given page may be on a
sub-domain, but you wish HTML Purifier to be more relaxed and
allow some of the parent domains too.
URI.HostBlacklist
Version added
|
1.3.0
|
Type
|
Array list
|
Default
|
array()
|
Used in
|
-
HTMLPurifier/URIFilter/HostBlacklist.php on line
25
|
List of strings that are forbidden in the host of any URI. Use it
to kill domain names of spam, etc. Note that it will catch anything
in the domain, so moo.com will catch
moo.com.example.com.
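The "catch anything in the domain" behavior amounts to a substring test. The helper below is a hypothetical illustration of that idea, not HTML Purifier's own code; the function name and parameters are assumptions made for the example.

```php
<?php
/**
 * Hypothetical helper (not part of HTML Purifier): returns true if any
 * blacklisted fragment occurs anywhere inside the host name.
 */
function host_is_blacklisted(string $host, array $blacklist): bool
{
    foreach ($blacklist as $fragment) {
        // Plain substring search: position does not matter.
        if (strpos($host, $fragment) !== false) {
            return true;
        }
    }
    return false;
}

$blacklist = array('moo.com');
var_dump(host_is_blacklisted('moo.com.example.com', $blacklist)); // bool(true)
var_dump(host_is_blacklisted('example.org', $blacklist));         // bool(false)
```

Under a substring test like this, a short entry also matches unrelated hosts that merely contain it (for instance `notmoo.com`), so entries are best kept as specific as possible.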
URI.MakeAbsolute
Version added
|
2.1.0
|
Type
|
Boolean
|
Default
|
false
|
Converts all URIs into absolute forms. This is useful when the
HTML being filtered assumes a specific base path, but will
actually be viewed in a different context (and setting an
alternate base URI is not possible). %URI.Base must be set for this directive to work.
URI.Munge
Munges all browsable (usually http, https and ftp) absolute URIs
into another URI, usually a URI redirection service. This
directive accepts a URI, formatted with a %s where
the url-encoded original URI should be inserted (sample:
http://www.google.com/url?q=%s).
Uses for this directive:
- Prevent PageRank leaks, while being fairly transparent to
users (you may also want to add some client side JavaScript to
override the text in the statusbar). Notice:
Many security experts believe that this form of protection does
not deter spam-bots.
- Redirect users to a splash page telling them they are leaving
your website. While this is poor usability practice, it is often
mandated in corporate environments.
Prior to HTML Purifier 3.1.1, this directive also enabled the
munging of browsable external resources, which could break things
if your redirection script was a splash page or used meta tags.
To revert to previous behavior, please use %URI.MungeResources.
You may want to also use %URI.MungeSecretKey along with this
directive in order to enforce what URIs your redirector script
allows. Open redirector scripts can be a security risk and
negatively affect the reputation of your domain name.
Starting with HTML Purifier 3.1.1, there are also these
substitutions:
Key | Description | Example <a href="">
%r | 1 - The URI embeds a resource; (blank) - The URI is merely a link | (blank)
%n | The name of the tag this URI came from | a
%m | The name of the attribute this URI came from | href
%p | The name of the CSS property this URI came from, or blank if irrelevant | (blank)
Admittedly, these letters are somewhat arbitrary; the only
stipulation was that they couldn't be a through f. r is for
resource (I would have preferred e, but you take what you can
get), n is for name, m was picked because it came after n (and I
couldn't use a), p is for property.
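To make the substitution keys concrete, here is a rough sketch of how a munge target could be expanded. The helper `munge_uri()` and its parameters are hypothetical and are not HTML Purifier's implementation. The example target extends the Google sample above with `&n=%n&m=%m`, which is likewise an assumption for illustration.

```php
<?php
/**
 * Hypothetical helper sketching the substitution step for a munge target.
 * Each value is url-encoded before being placed into the target.
 */
function munge_uri(
    string $target,
    string $uri,
    string $tag,
    string $attr,
    bool $isResource = false,
    string $cssProperty = ''
): string {
    $replace = array(
        '%s' => rawurlencode($uri),         // the original URI
        '%r' => $isResource ? '1' : '',     // 1 if the URI embeds a resource
        '%n' => rawurlencode($tag),         // tag name, e.g. "a"
        '%m' => rawurlencode($attr),        // attribute name, e.g. "href"
        '%p' => rawurlencode($cssProperty), // CSS property, or blank
    );
    // strtr() with an array does not rescan replaced text, so the
    // percent-escapes produced by rawurlencode() are left untouched.
    return strtr($target, $replace);
}

echo munge_uri(
    'http://www.google.com/url?q=%s&n=%n&m=%m',
    'http://example.com/a b',
    'a',
    'href'
), "\n";
// http://www.google.com/url?q=http%3A%2F%2Fexample.com%2Fa%20b&n=a&m=href
```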
URI.MungeResources
Version added
|
3.1.1
|
Type
|
Boolean
|
Default
|
false
|
Used in
|
-
HTMLPurifier/URIFilter/Munge.php on line 48
|
If true, any URI munging directives like %URI.Munge will also apply to embedded
resources, such as <img src="">. Be careful
enabling this directive if you have a redirector script that does
not use the Location HTTP header; all of your images
and other embedded resources will break.
Warning: It is strongly advised you use this in
conjunction with %URI.MungeSecretKey
to mitigate the security risk of an open redirector.
URI.MungeSecretKey
Version added
|
3.1.1
|
Type
|
String (or null)
|
Default
|
NULL
|
Used in
|
-
HTMLPurifier/URIFilter/Munge.php on line 49
|
This directive enables secure checksum generation along with
%URI.Munge. It should be set to a secure
key that is not shared with anyone else. The checksum can be
placed in the URI using %t. Use of this checksum affords an
additional level of protection by allowing a redirector to check
if a URI has passed through HTML Purifier with this line:
$checksum === hash_hmac("sha256", $url, $secret_key)
If the output is TRUE, the redirector script should accept the
URI.
Please note that it would still be possible for an attacker to
procure secure hashes en masse by abusing your website's Preview
feature or the like, but this service affords an additional level
of protection that should be combined with website blacklisting.
Remember this has no effect if %URI.Munge is not on.
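On the redirector side, the check described above might look roughly like the following. The function name, parameters, and placeholder key are assumptions for illustration. The sketch also swaps the `===` comparison for `hash_equals()`, a constant-time comparison that is a common hardening and not something this documentation prescribes.

```php
<?php
/**
 * Hypothetical redirector-side check: accept the URI only if the supplied
 * checksum matches the HMAC of the URL under the shared secret key.
 */
function redirect_is_authorized(string $url, string $checksum, string $secretKey): bool
{
    $expected = hash_hmac('sha256', $url, $secretKey);
    // hash_equals() compares in constant time (used here instead of ===).
    return hash_equals($expected, $checksum);
}

$secretKey = 'replace-with-your-secret'; // placeholder; use your own key
$url = 'http://example.com/';
$good = hash_hmac('sha256', $url, $secretKey);

var_dump(redirect_is_authorized($url, $good, $secretKey));      // bool(true)
var_dump(redirect_is_authorized($url, 'tampered', $secretKey)); // bool(false)
```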
URI.OverrideAllowedSchemes
Type
|
Boolean
|
Default
|
true
|
Used in
|
-
HTMLPurifier/URISchemeRegistry.php on line 49
|
If this is set to true (which it is by default), you can override
%URI.AllowedSchemes by simply
registering an HTMLPurifier_URIScheme to the registry. If false, you
will also have to update that directive in order to add more
schemes.
URI.SafeIframeRegexp
Version added
|
4.4.0
|
Type
|
String (or null)
|
Default
|
NULL
|
Used in
|
-
HTMLPurifier/URIFilter/SafeIframe.php on line 35
|
A PCRE regular expression that will be matched against an iframe
URI. This is a relatively inflexible scheme, but works well
enough for the most common use-case of iframes: embedded video.
This directive only has an effect if %HTML.SafeIframe is enabled. Here are some
example values:
- %^http://www.youtube.com/embed/% - Allow YouTube videos
- %^http://player.vimeo.com/video/% - Allow Vimeo videos
- %^http://(www.youtube.com/embed/|player.vimeo.com/video/)% - Allow both
Note that this directive does not give you enough granularity to,
say, disable all autoplay videos. Pipe up on the
HTML Purifier forums if this is a capability you want.
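To see how such a pattern behaves before deploying it, you can try it against sample URIs with PHP's `preg_match()`. The helper below is a hypothetical test harness, not part of HTML Purifier, and the sample URIs are made up for the example.

```php
<?php
/**
 * Hypothetical helper: reports whether an iframe URI matches the given
 * PCRE pattern (here delimited with % as in the examples above).
 */
function iframe_uri_matches(string $pattern, string $uri): bool
{
    return preg_match($pattern, $uri) === 1;
}

$pattern = '%^http://(www.youtube.com/embed/|player.vimeo.com/video/)%';

$candidates = array(
    'http://www.youtube.com/embed/abc123',
    'http://player.vimeo.com/video/12345',
    'http://evil.example/embed/abc123',
);

foreach ($candidates as $uri) {
    echo $uri, ' => ', iframe_uri_matches($pattern, $uri) ? 'allowed' : 'rejected', "\n";
}
```

Because the dots in these sample patterns are unescaped, each one matches any character, so a host such as `wwwxyoutube.com` would also pass. Escaping them (`www\.youtube\.com`) gives a stricter match.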