Profiling 444 tables in one job –
This is not a huge number of tables; one job / ruleset is enough to profile them with column level expressions, and the job should complete in less than half an hour. As a best practice, review the column names in the schema to make sure they indicate the data each column contains. For example, a column that holds first name data and is named FIRST_NAME meets this requirement. Use column level expressions whenever possible; they are much faster because they match against the column name instead of reading data.
If the columns have generic names such as ABC_XXX, you will need data level expressions, which by default read the first hundred values of the column and match them against the regular expression of the profiler expression in use. In the PROFILER tab of the masking UI, the expression level (column or data) appears in the “Level” column. With many data level expressions the profile job will run considerably longer, possibly several hours.
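The difference between the two levels can be illustrated with a short Python sketch. This is not the masking engine's implementation; the function names and the two expressions are hypothetical, and only the sample size of one hundred comes from the description above.

```python
import re
from itertools import islice

SAMPLE_SIZE = 100  # default number of values read, as described above


def column_level_match(column_name, expression):
    """Test the expression against the column name only; no data is read."""
    return re.search(expression, column_name, re.IGNORECASE) is not None


def data_level_match_count(values, expression, sample_size=SAMPLE_SIZE):
    """Count how many of the first `sample_size` values match the expression."""
    pattern = re.compile(expression)
    sample = islice(values, sample_size)
    return sum(1 for v in sample if v is not None and pattern.search(str(v)))


# Hypothetical expressions, for illustration only.
NAME_EXPR = r"F(IRST)?_?NAME"
SSN_EXPR = r"^\d{3}-\d{2}-\d{4}$"

print(column_level_match("FIRST_NAME", NAME_EXPR))  # descriptive name matches
print(column_level_match("ABC_XXX", NAME_EXPR))     # generic name does not
print(data_level_match_count(["123-45-6789", "n/a", "987-65-4321"], SSN_EXPR))
```

The column level check touches one string per column, while the data level check has to fetch and test up to one hundred values per column, which is why a profile set with many data level expressions takes so much longer.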
For very large schemas (using data level expressions) it may be beneficial to break up the rulesets used for profiling into smaller chunks. Since you’ll probably want to subset a large schema for masking (to increase parallel execution), you can save a step by splitting the rulesets before your initial profiling run. Each ruleset you create has its own inventory, which is empty (no domains or algorithms assigned) until you run the profiler or assign domains manually.
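As a planning aid, the split can be worked out before any rulesets are created. The sketch below only groups table names; the table names, the ruleset labels, and the chunk size of 100 are assumptions for illustration, and the rulesets themselves would still be created in the masking UI.

```python
def split_tables(tables, chunk_size):
    """Split a list of table names into consecutive groups of at most chunk_size."""
    return [tables[i:i + chunk_size] for i in range(0, len(tables), chunk_size)]


# Hypothetical table names standing in for the 444 tables in the schema.
tables = [f"TABLE_{n:03d}" for n in range(1, 445)]

groups = split_tables(tables, 100)
for number, group in enumerate(groups, start=1):
    print(f"ruleset_{number}: {len(group)} tables ({group[0]} .. {group[-1]})")
```

A better split than a simple count would group tables by size or by how many data level expressions they need, but an even split is a reasonable starting point.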
You can run multiple profile jobs (each with a different profile set) using the same ruleset. The profiler will overwrite the domain and algorithm with each execution unless you change the ID method to “USER” in the inventory. If you manually update a column’s domain and/or algorithm, always change the ID method to “USER” so that subsequent profile executions do not overwrite the domain and algorithm you selected. It is also possible for multiple expressions to match during profile execution. Care should be taken to ensure that the correct expression is matched and the correct domain/algorithm is applied. Default expressions / profile sets will usually not produce multiple matches. It is recommended that you not modify the default expressions; instead, create your own version and test it with a Regex test tool.
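If no Regex test tool is at hand, a few lines of Python can serve the same purpose. The sketch below is an assumption-laden example: the expressions and column names are made up, and the masking engine's regex dialect may differ from Python's `re` module, so treat a pass here as a first check rather than proof.

```python
import re


def check_expression(expression, should_match, should_not_match):
    """Return a list of (problem, name) pairs; an empty list means all checks passed."""
    pattern = re.compile(expression, re.IGNORECASE)
    failures = []
    for name in should_match:
        if not pattern.search(name):
            failures.append(("missed", name))
    for name in should_not_match:
        if pattern.search(name):
            failures.append(("false positive", name))
    return failures


def overlapping(expressions, name):
    """List the labels of every expression that matches the given name."""
    return [label for label, expr in expressions.items()
            if re.search(expr, name, re.IGNORECASE)]


# Hypothetical custom expression for last-name columns.
custom = r"^(L|LAST)_?NAME$"
print(check_expression(custom,
                       should_match=["LAST_NAME", "LNAME", "last_name"],
                       should_not_match=["FIRST_NAME", "LAST_NAME_FLAG"]))

# Hypothetical profile set containing one overly broad expression.
profile_set = {
    "FIRST_NAME": r"F(IRST)?_?NAME",
    "LAST_NAME": r"L(AST)?_?NAME",
    "NAME": r"NAME",
}
print(overlapping(profile_set, "LAST_NAME"))
```

The second check shows how a broad expression such as `NAME` collides with a more specific one; if more than one label comes back for a column name, tighten the expressions before running the profile job.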
If your job is suspended, you may want to open a support ticket to have the issue cleared. It’s possible, though rare, that the ruleset has been corrupted. If the problem cannot be resolved through the support ticket, you may have to delete the ruleset and rebuild it.