CN103885964A - Content checking method and system - Google Patents

Info

Publication number
CN103885964A
Authority
CN
China
Prior art keywords
examination
verification
hash
record
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201210559036.4A
Other languages
Chinese (zh)
Other versions
CN103885964B (en
Inventor
石海涛
杨刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Feinno Communication Technology Co Ltd
Original Assignee
Beijing Feinno Communication Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Feinno Communication Technology Co Ltd filed Critical Beijing Feinno Communication Technology Co Ltd
Priority to CN201210559036.4A priority Critical patent/CN103885964B/en
Publication of CN103885964A publication Critical patent/CN103885964A/en
Application granted granted Critical
Publication of CN103885964B publication Critical patent/CN103885964B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Abstract

The invention discloses a content checking method and system that use a hash checking list to check content published by a user. Each storage item of the hash checking list comprises a hash value and a checking parameter. The method includes the steps of: A, reading the data published by the user, selecting the valid content in that data, and using a hash digest algorithm to calculate the hash digest value of the valid content; B, judging whether the hash checking list contains a storage item whose hash value equals the hash digest value of the valid content and, if so, executing step C, and if not, executing step D; C, using the checking parameter in the storage item as the checking result of the data published by the user; D, using a sensitive word library to check the valid content to obtain the checking result of the data published by the user.

Description

Content checking method and system
Technical field
The present invention relates to the fields of computing and communications, and in particular to a content checking method and system.
Background
At present, the volume of user-published content accessed through content checking systems grows rapidly every month, and the data has the following characteristics:
Although the accessed content is of many types, it is concentrated mainly in a limited number of types, which can account for 90% of the total; identical content is forwarded frequently, and the forwarding volume is very large.
The basic checking flow of existing content checking systems is as follows:
First, a sensitive word library is maintained manually and takes effect in real time. After content is accessed, sensitive-word filtering is performed: content that hits any sensitive word is sent for manual checking, and content that hits none passes.
This flow causes no problems when a system first goes online and users publish little content. When the quantity of content explodes, however, manual checking often develops a backlog, and checkers must frequently check the same data repeatedly, which wastes manpower. What is needed, therefore, is a content checking method and system that, for repeated content, updates and dynamically adjusts the checking benchmark in real time so as to raise the proportion of automatic checking.
Summary of the invention
The invention provides a content checking method and system that update and dynamically adjust the checking benchmark in real time and raise the proportion of automatic checking. To achieve this, the invention adopts the following technical solution:
The invention discloses a content checking method that uses a hash checking list to check content published by users, each storage item of the hash checking list comprising a hash value and a checking parameter. The method comprises:
A. reading data published by a user, selecting valid content from the data, and using a hash digest algorithm to calculate the hash digest value of the valid content;
B. judging whether the hash checking list contains a storage item whose hash value equals the hash digest value of the valid content; if so, executing step C, and if not, executing step D;
C. using the checking parameter in that storage item as the checking result of the data published by the user;
D. using a sensitive word library to check the valid content: if the valid content hits no sensitive word in the library, the checking result of the data published by the user is a pass, and the data is qualified; if the valid content hits a sensitive word in the library, receiving a checking instruction from an administrator and checking the valid content that hit the sensitive word according to that instruction, to obtain the checking result of the data published by the user.
After the valid content that hit the sensitive word has been checked according to the checking instruction and the checking result of the data published by the user has been obtained, the method further comprises step E: updating a record in a record table according to the checking result, where each record in the record table comprises a hash value, a checking parameter, and a checking count; and judging whether the checking count of the updated record has reached a maximum threshold and, if so, deleting the record from the record table and moving its hash value and checking parameter into the hash checking list as a storage item, thereby updating the hash checking list. Here the hash value of each record is the hash digest value of the valid content that hit the sensitive word, the checking parameter represents the checking result of the user-published data corresponding to that valid content, and the checking count is the number of times that checking result has been obtained.
Updating the record in the record table specifically comprises: judging whether the record table contains a record whose hash value equals the hash digest value of the valid content that hit the sensitive word; if so, modifying that record, and if not, adding a new record.
When the record table is judged to contain such a record, modifying the record specifically comprises: judging whether the checking result of the data published by the user equals the checking parameter of the record; if so, increasing the record's checking count by 1; if not, decreasing the checking count by 1 and, if the checking count falls below a preset minimum, deleting the record.
Each storage item of the hash checking list further comprises a valid-time parameter, which decrements over time. Step C further comprises: setting the valid-time parameter of the storage item to a preset maximum valid time. The method further comprises: deleting a storage item when its valid-time parameter has decremented to 0.
Selecting valid content from the data published by the user specifically comprises: performing word segmentation on the data and filtering out punctuation marks and characters that do not affect the meaning in context.
The invention also discloses a content checking system that uses a hash checking list to check content published by users, each storage item of the hash checking list comprising a hash value and a checking parameter. The system comprises a data reading and analysis unit, a hash checking unit, and a content checking unit. The data reading and analysis unit is configured to read data published by a user, select valid content from the data, and use a hash digest algorithm to calculate the hash digest value of the valid content. The hash checking unit is configured to judge whether the hash checking list contains a storage item whose hash value equals the hash digest value of the valid content and, if so, to use the checking parameter in that storage item as the checking result of the data published by the user. The content checking unit is configured, when the hash checking unit judges that the hash checking list contains no such storage item, to check the valid content using a sensitive word library: if the valid content hits no sensitive word in the library, the checking result of the data published by the user is a pass, and the data is qualified; if the valid content hits a sensitive word in the library, the unit receives a checking instruction from an administrator and checks the valid content that hit the sensitive word according to that instruction, to obtain the checking result of the data published by the user.
The system further comprises a hash checking list processing unit configured, after the content checking unit has checked the valid content that hit the sensitive word according to the checking instruction and obtained the checking result of the data published by the user, to update a record in a record table according to the checking result, where each record in the record table comprises a hash value, a checking parameter, and a checking count; and to judge whether the checking count of the updated record has reached a maximum threshold and, if so, to delete the record from the record table and move its hash value and checking parameter into the hash checking list as a storage item, thereby updating the hash checking list. Here the hash value of each record is the hash digest value of the valid content that hit the sensitive word, the checking parameter represents the checking result of the user-published data corresponding to that valid content, and the checking count is the number of times that checking result has been obtained.
The hash checking list processing unit is specifically configured, when the record table is judged to contain a record whose hash value equals the hash digest value of the valid content that hit the sensitive word, to judge whether the checking result of the data published by the user equals the checking parameter of that record; if so, to increase the record's checking count by 1; if not, to decrease the checking count by 1 and, if the checking count falls below a preset minimum, to delete the record.
Each storage item of the hash checking list further comprises a valid-time parameter, which decrements over time. The hash checking unit is further configured, when it judges that the hash checking list contains a storage item whose hash value equals the hash digest value of the valid content, to set the valid-time parameter of that storage item to a preset maximum valid time. The hash checking list processing unit is further configured to delete a storage item when its valid-time parameter has decremented to 0. The data reading and analysis unit is specifically configured to perform word segmentation on the data published by the user and to filter out punctuation marks and characters that do not affect the meaning in context, so as to select the valid content.
The beneficial effects of the embodiments of the invention are as follows: selecting valid content removes characters that carry no contextual meaning, which makes the hash operation more accurate; and setting up a hash checking list and modifying it dynamically raises the proportion of automatic checking and reduces manpower consumption.
Brief description of the drawings
Fig. 1 is a flowchart of a content checking method provided by a preferred embodiment of the invention;
Fig. 2 is a detailed flowchart of updating the hash checking list in a content checking method provided by a preferred embodiment of the invention;
Fig. 3 is a block diagram of a content checking system provided by a preferred embodiment of the invention;
Fig. 4 is a schematic diagram of an application of the content checking system of the invention.
Embodiments
To make the objects, technical solutions, and advantages of the invention clearer, embodiments of the invention are described in further detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a content checking method provided by a preferred embodiment of the invention. The method uses a hash checking list to check content published by users, each storage item of the hash checking list comprising a hash value and a checking parameter. The method comprises the following steps:
S100: read the data published by the user, select valid content from the data, and use a hash digest algorithm to calculate the hash digest value of the valid content.
S200: judge whether the hash checking list contains a storage item whose hash value equals the hash digest value of the valid content. If so, execute step S300; if not, execute step S410.
S300: use the checking parameter in that storage item as the checking result of the data published by the user.
S410: judge whether the valid content hits a sensitive word in the sensitive word library. If not, the checking result of the data published by the user is a pass, the data is qualified, and the flow exits; if so, execute step S420.
S420: receive a checking instruction from an administrator and check the valid content that hit the sensitive word according to that instruction, to obtain the checking result of the data published by the user.
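The branching in steps S100 to S420 can be sketched as follows. This is only an illustration under stated assumptions: the patent does not name a hash digest algorithm, so SHA-256 is assumed here, and the names `check_content`, `hash_checking_list`, `sensitive_words`, and `manual_check` are hypothetical. The sketch takes valid content that has already been extracted, models the hash checking list as a plain dictionary from digest to checking parameter, and stands in for the administrator with a callback.

```python
import hashlib
from typing import Callable, Dict, Iterable


def check_content(
    valid_content: str,
    hash_checking_list: Dict[str, str],
    sensitive_words: Iterable[str],
    manual_check: Callable[[str], str],
) -> str:
    """Return a checking result for already-extracted valid content."""
    # S100: compute the hash digest value (SHA-256 is an assumption).
    digest = hashlib.sha256(valid_content.encode("utf-8")).hexdigest()

    # S200 / S300: on a hit, reuse the stored checking parameter.
    if digest in hash_checking_list:
        return hash_checking_list[digest]

    # S410: content that hits no sensitive word passes automatically.
    if not any(word in valid_content for word in sensitive_words):
        return "pass"

    # S420: otherwise the result comes from the administrator's instruction,
    # represented here by a hypothetical callback.
    return manual_check(valid_content)
```

In this sketch only content that misses the hash checking list and also hits a sensitive word reaches the manual path, which is the saving the method aims for.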
In step S100, selecting valid content from the data published by the user specifically means performing word segmentation on the data and filtering out punctuation marks and characters that do not affect the meaning in context. Calculating the hash digest value over the extracted valid content makes the hash operation more accurate and improves the matching rate. For example, punctuation marks, spaces, and words such as "eh" that do not affect the meaning are removed; data whose content has the same meaning but a different form can then also be checked through the hash checking list, which widens the scope of hash checking.
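A minimal sketch of this extraction step is given below, under several assumptions: whitespace splitting stands in for real word segmentation (Chinese text would need a proper segmenter), the filler-word set `FILLER_WORDS` is hypothetical apart from "eh", which the description mentions, and SHA-256 is assumed as the digest algorithm. The function names are illustrative only.

```python
import hashlib
import unicodedata

# Hypothetical filler words; the description only gives "eh" as an example.
FILLER_WORDS = {"eh", "um"}


def extract_valid_content(text: str) -> str:
    """Drop punctuation, spaces, and filler words from published data."""
    # Replace every Unicode punctuation character with a space.
    cleaned = "".join(
        " " if unicodedata.category(ch).startswith("P") else ch
        for ch in text
    )
    # Whitespace splitting is a stand-in for word segmentation.
    tokens = [t for t in cleaned.split() if t.lower() not in FILLER_WORDS]
    # Joining without separators also removes the spaces.
    return "".join(tokens)


def digest_of(valid_content: str) -> str:
    """Hash digest value of the valid content (SHA-256 assumed)."""
    return hashlib.sha256(valid_content.encode("utf-8")).hexdigest()
```

With this sketch, "Eh, big sale today!" and "big sale, today" reduce to the same valid content and therefore the same digest, so a result stored for one form also covers the other.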
Compared with the prior art, the invention uses a hash checking list so that automatic checking by the system replaces part of the manual checking, which reduces manpower consumption.
Fig. 2 is a detailed flowchart of updating the hash checking list in a content checking method provided by a preferred embodiment of the invention. A record table is used to aggregate and evaluate the valid content obtained in step S420 together with its checking results, and finally to generate and update the hash checking list. The flow comprises the following steps, performed after step S420 shown in Fig. 1:
S510: judge whether the record table contains a record whose hash value equals the hash digest value of the valid content that hit the sensitive word. If not, execute step S520; if so, execute step S530. Each record in the record table comprises a hash value, a checking parameter, and a checking count.
S520: add a new record whose hash value equals the hash digest value of the valid content that hit the sensitive word and whose checking parameter is the checking result of the user-published data corresponding to that valid content.
S530: judge whether the checking result of the data published by the user equals the checking parameter of the record. If so, execute step S540; if not, execute step S550.
S540: increase the checking count of the record by 1, and execute step S560.
S550: decrease the checking count by 1 and, if the checking count falls below a preset minimum, delete the record.
S560: judge whether the checking count of the updated record has reached the maximum threshold. If so, execute step S570; if not, exit the flow.
S570: delete the record from the record table and move its hash value and checking parameter into the hash checking list as a storage item, thereby updating the hash checking list.
Here the hash value of each record is the hash digest value of the valid content that hit the sensitive word, the checking parameter represents the checking result of the user-published data corresponding to that valid content, and the checking count is the number of times that checking result has been obtained.
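Steps S510 to S570 can be sketched as a single update function. The names `Record` and `update_record_table`, the default maximum threshold of 3, and the use of dictionaries for both tables are assumptions made for illustration; the initial count of 0 follows the example minimum the description gives for newly added records.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Record:
    checking_parameter: str  # checking result stored for this digest
    checking_count: int = 0  # times that result has been confirmed


def update_record_table(
    digest: str,
    result: str,
    record_table: Dict[str, Record],
    hash_checking_list: Dict[str, str],
    max_threshold: int = 3,  # assumed value; the patent leaves it configurable
    min_count: int = 0,      # preset minimum, also the initial count
) -> None:
    """Apply one manual checking result to the record table."""
    record = record_table.get(digest)
    if record is None:
        # S520: no record yet, so add one with the initial count.
        record_table[digest] = Record(result, min_count)
        return
    if record.checking_parameter == result:
        # S540: the same result again, so increase the count.
        record.checking_count += 1
        # S560 / S570: promote the record once the threshold is reached.
        if record.checking_count >= max_threshold:
            del record_table[digest]
            hash_checking_list[digest] = record.checking_parameter
    else:
        # S550: a conflicting result, so decrease and possibly delete.
        record.checking_count -= 1
        if record.checking_count < min_count:
            del record_table[digest]
```

Under these assumed defaults, a digest is promoted after one initial result plus three confirmations, while a conflicting result on a fresh record removes it immediately.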
Steps S510-S570 generate and update the cached hash checking list in real time. In actual checking, the content users publish is often time-dependent: within a given period some data (for example, trending content) is published repeatedly by many users, and the content published in different periods tends to differ. Updating the hash checking list in real time therefore improves the probability of automatic checking.
In this embodiment, each storage item of the hash checking list further comprises a valid-time parameter, which decrements over time. Step S300 further comprises setting the valid-time parameter of the storage item to a preset maximum valid time. In addition, the valid-time parameter can be tested against 0: when the valid-time parameter of a storage item in the hash checking list has decremented to 0, the storage item is deleted. The maximum valid time thus determines whether a storage item is deleted. Whenever the hash digest value of received valid content equals the hash value of a storage item in the hash checking list, that item's valid-time parameter is reset to the maximum valid time; when the hash value of a storage item has gone unmatched continuously for the maximum valid time, the storage item is deleted. Storage items that are used infrequently are therefore removed from the hash checking list, which keeps the list at a suitable size and avoids excessive computation when hash values are compared. The maximum valid time can be set according to the needs of the actual application: the larger it is, the larger the hash checking list tends to be; the smaller it is, the more likely content is to require subsequent checking against sensitive words.
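The valid-time behaviour can be sketched with an explicit `tick` method in place of a real clock, which keeps the example deterministic. The class and method names (`HashCheckingList`, `add`, `lookup`, `tick`) and the integer time unit are assumptions for illustration, not part of the patent text.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StorageItem:
    checking_parameter: str
    remaining_time: int  # valid-time parameter, decremented over time


class HashCheckingList:
    def __init__(self, max_valid_time: int) -> None:
        self.max_valid_time = max_valid_time
        self.items: Dict[str, StorageItem] = {}

    def add(self, digest: str, checking_parameter: str) -> None:
        self.items[digest] = StorageItem(checking_parameter, self.max_valid_time)

    def lookup(self, digest: str) -> Optional[str]:
        """Return the stored result and reset the valid time on a hit."""
        item = self.items.get(digest)
        if item is None:
            return None
        item.remaining_time = self.max_valid_time
        return item.checking_parameter

    def tick(self, elapsed: int = 1) -> None:
        """Advance time and delete items whose valid time has run out."""
        for digest in list(self.items):
            item = self.items[digest]
            item.remaining_time -= elapsed
            if item.remaining_time <= 0:
                del self.items[digest]
```

A storage item that keeps being matched stays in the list indefinitely, while one that goes unmatched for `max_valid_time` units is dropped, which is how the list size stays bounded in this sketch.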
In steps S510-S570 of this embodiment, a checking count is kept. When several consecutive checks of valid content with the same hash digest value yield the same result, the same valid content has appeared repeatedly, and its hash digest is added to the hash checking list; the hash checking list is thus updated in real time and the probability of automatic checking improves. Checking results are recorded in the record table and the hash checking list is modified according to the recorded results; when the checking results for the same valid content differ, no storage item is added to the hash checking list, which avoids modifying the list too frequently. A valid-time parameter can of course also be set in each record of the record table, decrementing over time and being reset when the record is modified, with the same role as the valid-time parameter in the storage items of the hash checking list.
In this embodiment, the preset minimum of the checking count is preferably the initial checking count assigned when a record is added, for example 0. The maximum threshold of the checking count can be preset according to actual needs, or adjusted dynamically according to the number of storage items currently in the hash checking list: for example, the maximum threshold is increased when the number of existing storage items is sufficiently large and decreased when that number is small.
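One possible way to adjust the maximum threshold dynamically is sketched below. The description states only the direction of the adjustment, so the function name `adjust_max_threshold`, the two size boundaries, the step size, and the floor of 1 are all assumptions chosen for illustration.

```python
def adjust_max_threshold(
    list_size: int,
    base_threshold: int = 3,    # assumed starting threshold
    low_water: int = 1_000,     # assumed "list is small" boundary
    high_water: int = 100_000,  # assumed "list is large enough" boundary
    step: int = 1,
) -> int:
    """Pick a maximum threshold from the current hash checking list size."""
    if list_size >= high_water:
        # Large list: require more confirmations before adding items.
        return base_threshold + step
    if list_size <= low_water:
        # Small list: admit items sooner, but never go below 1.
        return max(1, base_threshold - step)
    return base_threshold
```

Raising the threshold when the list is large slows its growth, and lowering it when the list is small lets repeated content reach automatic checking sooner.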
In addition, checking with the sensitive word library and with the hash checking list separately allows general sensitive words and highly time-sensitive content to be checked separately. Compared with existing checking methods, the sensitive word records need not be modified manually and frequently, which makes the system simpler to maintain.
Fig. 3 is a block diagram of a content checking system provided by a preferred embodiment of the invention. The content checking system uses a hash checking list to check content published by users, each storage item of the hash checking list comprising a hash value and a checking parameter. The system comprises a data reading and analysis unit 100, a hash checking unit 200, and a content checking unit 300.
In this embodiment, the data reading and analysis unit 100 is configured to read data published by a user, select valid content from the data, and use a hash digest algorithm to calculate the hash digest value of the valid content. The hash checking unit 200 is configured to judge whether the hash checking list contains a storage item whose hash value equals the hash digest value of the valid content and, if so, to use the checking parameter in that storage item as the checking result of the data published by the user. The content checking unit 300 is configured, when the hash checking unit judges that the hash checking list contains no such storage item, to check the valid content using a sensitive word library: if the valid content hits no sensitive word in the library, the checking result of the data published by the user is a pass, and the data is qualified; if the valid content hits a sensitive word in the library, the unit receives a checking instruction from an administrator and checks the valid content that hit the sensitive word according to that instruction, to obtain the checking result of the data published by the user.
The content checking system of this embodiment further comprises a hash checking list processing unit 400 configured, after the content checking unit 300 has checked the valid content that hit the sensitive word according to the checking instruction and obtained the checking result of the data published by the user, to update a record in the record table according to the checking result and to update the hash checking list according to the records in the record table. Specifically, if the valid content hits a sensitive word in the sensitive word library, a record in the record table is updated once the checking result of the data published by the user has been obtained; that is, a record is either added to the record table or modified in it. Each record in the record table comprises a hash value, a checking parameter, and a checking count. After a record in the record table has been updated, and in particular after a record has been modified, the unit judges whether the checking count of that record has reached the maximum threshold; if so, it deletes the record from the record table and moves its hash value and checking parameter into the hash checking list as a storage item, thereby updating the hash checking list. Here the hash value of each record is the hash digest value of the valid content that hit the sensitive word, the checking parameter represents the checking result of the user-published data corresponding to that valid content, and the checking count is the number of times that checking result has been obtained.
More specifically, data read analytic unit 100 read user issue data, choose effective content and calculate Hash digest value, wherein, data read analytic unit 100 data of user's issue are cut to word analysis, and filtering does not have influential punctuation mark and character to choose effective content to context.
Hash examination & verification unit 200 can be compared the cryptographic hash of each Storage Item of the Hash digest value calculating and Hash examination & verification list, judges whether to equate.
Content auditing unit 300 is similar to existing general examination & verification unit: use and comprise that the responsive dictionary of sensitive word audits effective content, judge whether effective content hits the sensitive word in responsive dictionary, and in the time that effectively content is hit sensitive word, receiving management people's examination & verification instruction is to obtain auditing result.
In the present embodiment, each Storage Item of Hash examination & verification list also comprises effective time parameter, and this, parameter can be successively decreased in time effective time; When Hash examination & verification unit 200 judges while existing cryptographic hash to equal the Storage Item of Hash digest value of effective content in Hash examination & verification list, the maximum effective time that effective time, parameter was set to preset among Hash examination & verification unit 200 these Storage Items; Hash examination & verification list processing unit 400, in the time judging that the effective time of a Storage Item, parameter was decremented to 0 in time, deletes this Storage Item.
Among the present embodiment, Hash examination & verification list processing unit 400, can be by using record sheet automatically to adjust Hash examination & verification list, and each of record sheet record comprises cryptographic hash, examination & verification parameter and examination & verification number of times.If content auditing unit 300 judges effective content and hits sensitive word, according to described examination & verification instruction, described effective content of hitting sensitive word is being audited, after obtaining the auditing result of data of described user's issue, Hash examination & verification list processing unit 400 can judge in record sheet, whether there is a record, the cryptographic hash that this record comprises equals the Hash digest value of effective content, revise this record if be judged as YES, if be judged as otherwise a newly-increased record, the cryptographic hash of this newly-increased record equals the Hash digest value of effective content, examination & verification parameter is corresponding examination & verification instruction.
In this embodiment, when the hash audit list processing unit 400 determines that the record table contains a record whose hash value equals the hash digest value of the effective content that hit a sensitive word, it modifies the record as follows. It checks whether the audit result for the user-published data equals the audit parameter of the record. If so, it increases the record's audit count by 1. If not, it decreases the audit count by 1; if the audit count is less than a preset minimum, it deletes the record. Because the audit count is adjusted in this way, no storage item is added to the hash audit list when the same sensitive-word-hitting effective content receives differing audit results. A valid-time parameter may also be set in each record of the record table; it decrements over time and is reset whenever the record is modified, serving the same purpose as the valid-time parameter in the storage items of the hash audit list.
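The record-table update rule in the two paragraphs above can be sketched as follows. The function name `update_record`, the dictionary layout, the constant `MIN_COUNT`, and the initial audit count of 1 for a new record are assumptions for illustration; the text does not state the starting count or the minimum value.

```python
MIN_COUNT = 1  # hypothetical preset minimum audit count


def update_record(record_table, hash_value, audit_result):
    """Apply one manual audit result to the record table (a plain dict)."""
    record = record_table.get(hash_value)
    if record is None:
        # No record yet: add one. Starting the count at 1 is an assumption.
        record_table[hash_value] = {"audit_param": audit_result, "count": 1}
        return
    if record["audit_param"] == audit_result:
        record["count"] += 1          # same result again: strengthen the record
    else:
        record["count"] -= 1          # conflicting result: weaken the record
        if record["count"] < MIN_COUNT:
            del record_table[hash_value]
```

With these values, a single conflicting result removes a record that has been confirmed only once, so inconsistent decisions never accumulate enough count to be trusted.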
Fig. 4 is a schematic diagram of an application of the content auditing system of the present invention. It shows the system applied to auditing user-published data. As the figure shows, after the data is read in, the effective content is first selected and a hash digest value is computed from it. The hash digest value is then matched against the hash audit list held in the cache. On a hit, the hash audit result goes directly into the audit log store and is fed back to the business line for subsequent business processing; no further auditing is needed. The hash audit result is stored in the audit log store in the form of a data feedback task list.
If the hash audit list is not hit, sensitive-word filtering is performed. If no sensitive word is hit, the data passes the audit and enters the audit log store. If a sensitive word is hit, further auditing, for example manual review, is carried out. The audit results are then aggregated, the record table is modified, and the hash audit list in the cache is updated according to the record table.
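The flow of Fig. 4 can be sketched end to end as follows. This is an illustrative outline under several assumptions: SHA-256 stands in for the unspecified hash digest algorithm, the sensitive-word dictionary is a small set matched by substring, the manual review is passed in as a callback, and the names `digest`, `update_record_table`, `audit`, `MAX_THRESHOLD`, and `MIN_COUNT` are hypothetical. Effective-content selection and the audit log store are left out to keep the sketch short.

```python
import hashlib

SENSITIVE_WORDS = {"badword"}  # hypothetical sensitive-word dictionary
MAX_THRESHOLD = 2              # hypothetical audit count needed for promotion
MIN_COUNT = 1                  # hypothetical preset minimum audit count


def digest(effective_content):
    # SHA-256 is an assumption; the text only says "hash digest algorithm".
    return hashlib.sha256(effective_content.encode("utf-8")).hexdigest()


def update_record_table(record_table, hash_list, h, result):
    """Aggregate one manual result and promote the record once it is stable."""
    record = record_table.get(h)
    if record is None:
        record = record_table[h] = {"audit_param": result, "count": 1}
    elif record["audit_param"] == result:
        record["count"] += 1
    else:
        record["count"] -= 1
    if record["count"] < MIN_COUNT:
        del record_table[h]
    elif record["count"] >= MAX_THRESHOLD:
        # Move the record into the hash audit list as a storage item.
        hash_list[h] = record["audit_param"]
        del record_table[h]


def audit(effective_content, hash_list, record_table, manual_review):
    """Return (audit_result, stage) for already-selected effective content."""
    h = digest(effective_content)
    if h in hash_list:                       # hash audit list hit
        return hash_list[h], "hash"
    if not any(word in effective_content for word in SENSITIVE_WORDS):
        return "pass", "dictionary"          # no sensitive word hit
    result = manual_review(effective_content)
    update_record_table(record_table, hash_list, h, result)
    return result, "manual"
```

With `MAX_THRESHOLD = 2`, the same flagged content reaches the reviewer twice; from the third submission onward it is answered directly from the hash audit list.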
Embodiments of the present invention have the following advantages:
(1) By using the hash audit list, data that occurs frequently within a period of time is audited automatically by the system, which replaces part of the manual auditing and reduces labor costs.
(2) Word segmentation is performed on the user-published data to select the effective content, filtering out punctuation marks and characters that do not affect the context. This makes the hash operation more accurate and widens the scope of hash auditing.
(3) Computing the hash digest value with a hash digest algorithm enables fast auditing of data.
(4) Dynamically modifying the hash audit list raises the proportion of automatic auditing.
(5) A record table is set up to record audit results, and the hash audit list is modified according to the recorded results, so only hash values that satisfy the conditions are written into the hash audit list. This keeps audit accuracy high and avoids modifying the hash audit list too frequently.
(6) Deleting rarely used storage items from the hash audit list keeps the list at a suitable size and avoids excessive computation when comparing the hash digest value of the effective content.
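Advantages (2) and (3) rest on normalizing the text before hashing it. The following sketch shows one way this might look, under stated assumptions: real word segmentation is omitted, "characters that do not affect the context" is approximated as Unicode punctuation and whitespace, and SHA-256 stands in for the unspecified hash digest algorithm. The function names `select_effective_content` and `hash_digest` are hypothetical.

```python
import hashlib
import unicodedata


def select_effective_content(data):
    """Drop whitespace and Unicode punctuation; keep every other character."""
    kept = []
    for ch in data:
        # Unicode general categories starting with "P" are punctuation.
        if ch.isspace() or unicodedata.category(ch).startswith("P"):
            continue
        kept.append(ch)
    return "".join(kept)


def hash_digest(data):
    """Hash the effective content (SHA-256 is an assumption of this sketch)."""
    effective = select_effective_content(data)
    return hashlib.sha256(effective.encode("utf-8")).hexdigest()
```

Because punctuation and spacing are removed first, variants such as "Hello, world!" and "Hello world" produce the same digest and therefore hit the same storage item.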
It should be noted that, in this document, the terms "comprise", "include", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that comprises the element.
A person of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented in hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disc.
The above is merely a specific embodiment of the present invention, but the scope of protection of the present invention is not limited to it. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be determined by the scope of protection of the claims.

Claims (10)

1. A content auditing method, characterized in that content published by a user is audited using a hash audit list, each storage item of which comprises a hash value and an audit parameter, the method comprising:
A. reading the data published by the user, selecting effective content from the user-published data, and computing a hash digest value of the effective content using a hash digest algorithm;
B. determining whether the hash audit list contains a storage item whose hash value equals the hash digest value of the effective content; if so, performing step C; if not, performing step D;
C. taking the audit parameter of that storage item as the audit result for the user-published data;
D. auditing the effective content using a sensitive-word dictionary; if the effective content does not hit any sensitive word in the dictionary, obtaining a pass as the audit result for the user-published data, the user-published data being acceptable; if the effective content hits a sensitive word in the dictionary, receiving an administrator's audit instruction, auditing the effective content that hit the sensitive word according to the audit instruction, and obtaining the audit result for the user-published data.
2. The method according to claim 1, characterized in that,
after the effective content that hit a sensitive word has been audited according to the audit instruction and the audit result for the user-published data has been obtained, the method further comprises step E: updating a record in a record table according to the audit result, wherein each record in the record table comprises a hash value, an audit parameter, and an audit count; determining whether the audit count of the updated record has reached a maximum threshold and, if so, deleting the record from the record table and moving its hash value and audit parameter into the hash audit list as a storage item, thereby updating the hash audit list; wherein the hash value of each record is the hash digest value of the effective content that hit a sensitive word, the audit parameter represents the audit result for the user-published data corresponding to that effective content, and the audit count is the number of times that audit result has been obtained.
3. The method according to claim 2, characterized in that,
updating the record in the record table specifically comprises:
determining whether the record table contains a record whose hash value equals the hash digest value of the effective content that hit a sensitive word; if so, modifying that record; if not, adding a new record.
4. The method according to claim 3, characterized in that,
when it is determined that the record table contains a record whose hash value equals the hash digest value of the effective content that hit a sensitive word, modifying that record specifically comprises:
determining whether the audit result for the user-published data equals the audit parameter of the record; if so, increasing the audit count of the record by 1; if not, decreasing the audit count by 1; and, if the audit count is less than a preset minimum, deleting the record.
5. The method according to claim 1, characterized in that,
each storage item of the hash audit list further comprises a valid-time parameter, which decrements over time;
step C further comprises: setting the valid-time parameter of that storage item to a preset maximum valid time;
the method further comprises: deleting a storage item when its valid-time parameter has decremented to 0.
6. The method according to any one of claims 1 to 5, characterized in that,
selecting effective content from the user-published data specifically comprises: performing word segmentation on the user-published data and filtering out punctuation marks and characters that do not affect the context.
7. A content auditing system, characterized in that content published by a user is audited using a hash audit list, each storage item of which comprises a hash value and an audit parameter, the system comprising a data reading and analysis unit, a hash auditing unit, and a content auditing unit, wherein:
the data reading and analysis unit is configured to read the data published by the user, select effective content from the user-published data, and compute a hash digest value of the effective content using a hash digest algorithm;
the hash auditing unit is configured to determine whether the hash audit list contains a storage item whose hash value equals the hash digest value of the effective content and, if so, to take the audit parameter of that storage item as the audit result for the user-published data;
the content auditing unit is configured to, when the hash auditing unit determines that the hash audit list contains no storage item whose hash value equals the hash digest value of the effective content, audit the effective content using a sensitive-word dictionary; if the effective content does not hit any sensitive word in the dictionary, obtain a pass as the audit result for the user-published data, the user-published data being acceptable; if the effective content hits a sensitive word in the dictionary, receive an administrator's audit instruction, audit the effective content that hit the sensitive word according to the audit instruction, and obtain the audit result for the user-published data.
8. The system according to claim 7, characterized in that,
the system further comprises a hash audit list processing unit configured to, after the content auditing unit has audited the effective content that hit a sensitive word according to the audit instruction and obtained the audit result for the user-published data, update a record in a record table according to the audit result, wherein each record in the record table comprises a hash value, an audit parameter, and an audit count; and to determine whether the audit count of the updated record has reached a maximum threshold and, if so, delete the record from the record table and move its hash value and audit parameter into the hash audit list as a storage item, thereby updating the hash audit list; wherein the hash value of each record is the hash digest value of the effective content that hit a sensitive word, the audit parameter represents the audit result for the user-published data corresponding to that effective content, and the audit count is the number of times that audit result has been obtained.
9. The system according to claim 8, characterized in that,
the hash audit list processing unit is specifically configured to: when it determines that the record table contains a record whose hash value equals the hash digest value of the effective content that hit a sensitive word, determine whether the audit result for the user-published data equals the audit parameter of the record; if so, increase the audit count of the record by 1; if not, decrease the audit count by 1; and, if the audit count is less than a preset minimum, delete the record.
10. The system according to any one of claims 7 to 9, characterized in that each storage item of the hash audit list further comprises a valid-time parameter, which decrements over time;
the hash auditing unit is further configured to: upon determining that the hash audit list contains a storage item whose hash value equals the hash digest value of the effective content, set the valid-time parameter of that storage item to a preset maximum valid time;
the hash audit list processing unit is further configured to: delete a storage item when its valid-time parameter has decremented to 0;
the data reading and analysis unit is specifically configured to: perform word segmentation on the user-published data and filter out punctuation marks and characters that do not affect the context, so as to select the effective content.
CN201210559036.4A 2012-12-20 2012-12-20 A kind of content auditing method and system Active CN103885964B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210559036.4A CN103885964B (en) 2012-12-20 2012-12-20 A kind of content auditing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210559036.4A CN103885964B (en) 2012-12-20 2012-12-20 A kind of content auditing method and system

Publications (2)

Publication Number Publication Date
CN103885964A (en) 2014-06-25
CN103885964B (en) 2017-06-27

Family

ID=50954859

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210559036.4A Active CN103885964B (en) 2012-12-20 2012-12-20 A kind of content auditing method and system

Country Status (1)

Country Link
CN (1) CN103885964B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030009495A1 (en) * 2001-06-29 2003-01-09 Akli Adjaoute Systems and methods for filtering electronic content
US20030009365A1 (en) * 2001-01-09 2003-01-09 Dermot Tynan System and method of content management and distribution
US20080005086A1 (en) * 2006-05-17 2008-01-03 Moore James F Certificate-based search
CN101594316A (en) * 2008-05-30 2009-12-02 华为技术有限公司 A kind of management method of distributed network, content search method, system and device
CN102184245A (en) * 2011-05-18 2011-09-14 华北电力大学 Method for fast searching massive text data keywords
CN103186669A (en) * 2013-03-21 2013-07-03 厦门雅迅网络股份有限公司 Method for rapidly filtering key word

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
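詹川 等 (Zhan Chuan et al.): "基于签名的近似垃圾邮件检测算法" (A signature-based near-duplicate spam detection algorithm), 《计算机工程》 (Computer Engineering) *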

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106327140A (en) * 2015-06-25 2017-01-11 阿里巴巴集团控股有限公司 Method and device for monitoring data modification
CN106327140B (en) * 2015-06-25 2019-12-06 阿里巴巴集团控股有限公司 Method and device for monitoring data modification
CN108768840A (en) * 2018-06-12 2018-11-06 北京京东金融科技控股有限公司 A kind of method and apparatus of account management
CN111126928A (en) * 2018-10-29 2020-05-08 阿里巴巴集团控股有限公司 Method and device for auditing release content
CN111126928B (en) * 2018-10-29 2024-03-22 阿里巴巴集团控股有限公司 Method and device for auditing release content
CN110602251A (en) * 2019-09-30 2019-12-20 腾讯科技(深圳)有限公司 Data processing method, device, apparatus and medium based on inter-node data sharing
CN110602251B (en) * 2019-09-30 2021-12-17 腾讯科技(深圳)有限公司 Data processing method, device, apparatus and medium based on inter-node data sharing
CN112749420A (en) * 2020-12-23 2021-05-04 上海同态信息科技有限责任公司 Private data cooperation method taking hash function as attribute
CN112529700A (en) * 2020-12-29 2021-03-19 平安消费金融有限公司 Business handling and auditing method, system, equipment and readable storage medium

Also Published As

Publication number Publication date
CN103885964B (en) 2017-06-27

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP02 Change in the address of a patent holder

Address after: Room 810, 8 / F, 34 Haidian Street, Haidian District, Beijing 100080

Patentee after: BEIJING D-MEDIA COMMUNICATION TECHNOLOGY Co.,Ltd.

Address before: 100089 Beijing city Haidian District wanquanzhuang Road No. 28 Wanliu new building 6 storey block A room 602

Patentee before: BEIJING D-MEDIA COMMUNICATION TECHNOLOGY Co.,Ltd.
