Creating Custom DLP Classification Rules and Policy


When I first looked into this, the TechNet documentation was extensive yet not as specific as I would have liked, so here is the quick and dirty guide to DLP classification!

Creating and importing custom Classifications

  1. First, create your custom classification XML (example below).
  2. Save it as a Unicode (UTF-16) XML file, matching the encoding declared in the XML (e.g. C:\MyNewPolicy.xml).
  3. Open the XML in Internet Explorer; if it is formatted correctly, you will see the XML.
  4. Then import it with PowerShell:
    New-ClassificationRuleCollection -FileData ([Byte[]]$(Get-Content -Path C:\MyNewPolicy.xml -Encoding Byte -ReadCount 0))
  5. Once it is imported, you should be able to create a new DLP policy using the EAC.
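
The formatting check in step 3 can also be scripted. The following is a minimal Python sketch, not part of the Exchange tooling: it flags the typographic quotes and dashes that tend to sneak in when XML is copied from a web page, then tries to parse the text. The function name check_rule_pack and the list of suspect characters are assumptions made for illustration, and the check only confirms that the XML is well-formed, not that Exchange will accept the rule pack.

```python
import xml.etree.ElementTree as ET

# Typographic characters that web pages and word processors often
# substitute for plain ASCII quotes and hyphens (illustrative list).
SUSPECT_CHARS = {
    "\u201c": "left double quote",
    "\u201d": "right double quote",
    "\u2033": "double prime",
    "\u2018": "left single quote",
    "\u2019": "right single quote",
    "\u2013": "en dash",
}


def check_rule_pack(xml_text):
    """Return a list of problems found in xml_text (empty if none were found)."""
    problems = []
    # Heuristic pass: report every line containing a suspect character.
    for line_no, line in enumerate(xml_text.splitlines(), start=1):
        for char, name in SUSPECT_CHARS.items():
            if char in line:
                problems.append(f"line {line_no}: {name} found")
    # Well-formedness pass: let the XML parser report syntax errors.
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        problems.append(f"not well-formed: {exc}")
    return problems
```

To check a saved file, read it with open(path, encoding="utf-16") and pass the text to check_rule_pack. A suspect character inside a description is harmless, so treat those reports as hints rather than errors.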

Creating a custom DLP Rule

  1. Log in to the EAC (e.g. https://mail.domain.com/ecp).
  2. Click Compliance Management, then data loss prevention.
  3. Click the plus (+) icon, then New custom policy.
  4. Name your policy, choose your mode (I like to test with Policy Tips), and click Save.
  5. Select the policy and click the edit icon to edit your new policy.
  6. Select Rules on the left.
  7. Click the add (+) icon to create a new rule.
  8. In the "Apply this rule if" field, choose "The message contains sensitive information...".
  9. Click *Select sensitive information types... (if applicable).
  10. Click the add (+) icon to choose from the list.
  11. You should now see your new classification (for the example below it would be Secure Product Codes \ DLP by Exchangemasters.info).


Example of a Rule Classification XML

<?xml version="1.0" encoding="utf-16"?>
<RulePackage xmlns="http://schemas.microsoft.com/office/2011/mce">
  <RulePack id="b4b4c60e-2ff7-47b2-a672-86e36cf608be">
    <Version major="1" minor="0" build="0" revision="0"/>
    <Publisher id="7ea13c35-0e58-472a-b864-5f2e717edec6"/>
    <Details defaultLangCode="en-us">
      <LocalizedDetails langcode="en-us">
        <PublisherName>DLP by Exchangemasters.info</PublisherName>
        <Name>Secure Product Codes</Name>
        <Description>Secure Products</Description>
      </LocalizedDetails>
    </Details>
  </RulePack>
  <Rules>
    <!-- Product Code -->
    <Entity id="acc59528-ff01-433e-aeee-13ca8aaee159" patternsProximity="300" recommendedConfidence="75">
      <Pattern confidenceLevel="75">
        <IdMatch idRef="Regex_Product_Code"/>
        <Match idRef="Code"/>
      </Pattern>
    </Entity>
    <Regex id="Regex_Product_Code">[A-Z]{3}[0-9]{9}</Regex>
    <Keyword id="Code">
      <Group matchStyle="word">
        <Term>Code</Term>
      </Group>
    </Keyword>
    <LocalizedStrings>
      <Resource idRef="acc59528-ff01-433e-aeee-13ca8aaee159">
        <Name default="true" langcode="en-us">Product Code</Name>
        <Description default="true" langcode="en-us">A custom classification for detecting product codes that have 3 uppercase letters and 9 numbers</Description>
      </Resource>
    </LocalizedStrings>
  </Rules>
</RulePackage>
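
Before importing, it can help to try the pattern against some sample text. The Python sketch below is only a rough approximation of what the Entity above describes: a match on the product code regex with the keyword "Code" within 300 characters. It is not Exchange's classification engine. The function name find_product_codes, the character-based window, and the case-insensitive keyword match are all assumptions made for illustration.

```python
import re

# Same pattern as the Regex_Product_Code element in the example XML.
PRODUCT_CODE = re.compile(r"[A-Z]{3}[0-9]{9}")
# Assumption: the keyword is matched as a whole word, ignoring case.
KEYWORD = re.compile(r"\bCode\b", re.IGNORECASE)
# Taken from patternsProximity="300"; counted here in characters (assumption).
PROXIMITY = 300


def find_product_codes(text):
    """Return product codes that have the keyword within PROXIMITY characters."""
    hits = []
    for match in PRODUCT_CODE.finditer(text):
        start = max(0, match.start() - PROXIMITY)
        window = text[start:match.end() + PROXIMITY]
        if KEYWORD.search(window):
            hits.append(match.group())
    return hits
```

For example, find_product_codes("Product Code: ABC123456789") reports the code, while the same code with no keyword nearby is ignored. The real test is still a message sent through a policy in test mode.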

18 thoughts on “Creating Custom DLP Classification Rules and Policy”

  1. Hi there

    I’m trying to add new sensitive information types to Exchange Server 2013, running in a virtual machine (VM), with “New-ClassificationRuleCollection -FileData ([Byte[]]$(Get-Content -Path C:\MyNewPolicy.xml -Encoding Byte -ReadCount 0))”.
    I followed all your steps, but when I ran the Exchange PowerShell command I got this error:

    “Unable to continue processing classification rule collection payload for decryption or further validations. Payload may contain invalid data”

    Perhaps you can help me to solve this error.

    Thank you.

    • That error indicates an issue with something in the XML. Sometimes when you copy from a web page you need to re-type the quotes.

      Or there is another syntax issue in the XML.

  2. Thank you for your response. I really appreciate it. 🙂

    I’ve been given a project to create a data loss prevention service within 2 weeks and I’m kind of new to this. When creating the XML file, can I just copy the RulePack id, Publisher id and Entity id from your example? I even re-typed the quotes and I still get the same error 😦 Maybe you can point me to some good examples of the XML schema for adding new sensitive information types.

    Thank you so much

      • Thanks a lot for your reply, man, I really appreciate it. I have managed to add the sensitive information types and was able to generate the RulePack and Publisher ids through your link. From my error, I realized that I sometimes need to generate a new GUID, since uniqueness is not really guaranteed. I also tested my XML with an XML validator to make sure there were no errors.

    • Hussein, did you save the file as Unicode XML?
      If not, that could also be the issue. I also noticed that some weird characters were added to the <RulePackage xmlns=" line in the post; I will fix that in the post now.
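
The RulePack, Publisher and Entity ids discussed in this thread are GUIDs, so any GUID generator will do. Here is a minimal sketch using Python's standard uuid module. The function name new_rule_pack_ids and the dictionary keys are just labels chosen for illustration.

```python
import uuid


def new_rule_pack_ids():
    """Return fresh random GUIDs for the three id attributes in the example XML."""
    return {
        "RulePack": str(uuid.uuid4()),
        "Publisher": str(uuid.uuid4()),
        "Entity": str(uuid.uuid4()),
    }


if __name__ == "__main__":
    # Print the ids in a form that is easy to paste into the XML.
    for element, guid in new_rule_pack_ids().items():
        print(f'{element} id="{guid}"')
```

In the example XML the Entity id is repeated as the idRef of its Resource element, so a new Entity id has to be pasted in both places.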

  3. I’m trying to remove the sensitive information types that I added by following the steps you posted. Is that possible?

  4. Thank you :):):) I’m now able to delete the rule. May I know how to modify the rule, as you mentioned earlier? 🙂

  5. Thanks for being one of the few sources on the net with an example of how to make a custom classification map. Do you happen to know offhand how to create one with a simple keyword-based match list? Would I need to set a confidence level and proximity for a word or string keyword-based match? I’ve tried working off your template and replacing it with a keyword-based match (as shown in the TechNet documentation for matching methods: http://technet.microsoft.com/en-us/library/jj674702(v=exchg.150).aspx), but my XML keeps failing when I try to import it.

  6. Absolutely!
    The example above uses a keyword as well as the regex. Here is the section you want to look at:

    <Keyword id="Code">
      <Group matchStyle="word">
        <Term>Code</Term>
      </Group>
    </Keyword>

    Just add additional Term lines to match more keywords:
    <Term>Word2</Term>
    <Term>word3</Term>
    You can remove the regex if you don’t want to use it.
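
For a longer keyword list, typing the Term lines by hand invites typos. The sketch below is one possible way to generate the Keyword section from a list of words with Python's standard ElementTree module. The function name build_keyword is an assumption for illustration, and the element and attribute names are copied from the example XML in the post.

```python
import xml.etree.ElementTree as ET


def build_keyword(keyword_id, terms):
    """Return a Keyword element, as a string, with one Term per entry in terms."""
    keyword = ET.Element("Keyword", {"id": keyword_id})
    group = ET.SubElement(keyword, "Group", {"matchStyle": "word"})
    for term in terms:
        # ElementTree escapes characters such as & and < in the term text.
        ET.SubElement(group, "Term").text = term
    return ET.tostring(keyword, encoding="unicode")
```

The result comes back on a single line and can be pasted inside the Rules element in place of the existing Keyword section.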

  7. Awesome post. I was trying to create a custom policy and classification that would trigger when even a single SSN was detected. I have a customer who asked me if they could create a policy to detect an SSN with no other info like name etc. When I tried to combine your guidance with the info from TechNet, I got an error that I can’t seem to find anything about, and I was curious whether you had seen it before: Invalid text processor reference(s) detected: “FormattedSSN”. The referenced text processors don’t exist, or it is Fingerprint text processor but referenced by IdMatch or old version of classification rule.
    Have you created any classification to tweak SSN detection before? I am wondering if I am trying to do something that’s not supported. I am including my XML below so you can see what is triggering the error.

    DLP by the Cloud Master
    Custom SSN Classification
    Custom SSN Classification

    Social Security Number

    A custom classification for detecting Social Security numbers

  8. Hi Jedi,

    This was the only article I found on creating custom DLP classification rules. I have successfully imported several classification rule sets.
    I have a question: how do I include several regex values as conditions in a single XML?
    For example, we need to check an employee ID (8 digits) written as
    xxxx-xxx-x
    xxxx.xxx.x
    xxxx xxx x

    Any idea how we can include these in one XML, since we are checking the same employee ID in different formats?

    Many Thanks

    • Sorry for the very late reply.
      It should work if you simply add more rules within <Rules>;
      ensure they all have unique IDs.

      Actually, for that you can probably just extend the regex string (I would recommend using a regex tool like regexr to figure out what it should look like).
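
As a sketch of the single-regex idea, the Python snippet below covers the three formats from the question with one pattern by allowing a hyphen, dot or space as the separator. The pattern and the function name find_employee_ids are assumptions made for illustration. The pattern also accepts mixed separators such as 1234-567.8, and whether Exchange's regex engine treats it identically should be confirmed with a test message.

```python
import re

# Four digits, a separator, three digits, a separator, one digit.
# [-. ] accepts a hyphen, a dot or a space as the separator.
EMPLOYEE_ID = re.compile(r"[0-9]{4}[-. ][0-9]{3}[-. ][0-9]")


def find_employee_ids(text):
    """Return every substring of text that looks like an employee ID."""
    return [match.group(0) for match in EMPLOYEE_ID.finditer(text)]
```

The pattern string itself, [0-9]{4}[-. ][0-9]{3}[-. ][0-9], is what would go inside the Regex element of the rule pack.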
