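At the start of the year my boss asked me to build an intervention feature for hot search terms: take the statistics on what users search for most and let editors manually pin a term to a chosen position in the ranking. It is a rather distasteful task, but I'm just a grunt working for a paycheck, so as long as nothing crosses the line I can put up with it.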
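On to the tech. The workflow: collect the users' search logs, count how many times each keyword is searched per day, use the statistical variance over each keyword's count history to score its recent hotness, then sort in descending order to produce the result list. (To be more thorough you could add a semantic-analysis step before the computation, which would do a better job of catching internet slang that has only just become popular. I didn't go that deep, so I'll leave it aside for now.)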
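The post does not give the exact scoring formula, so the following is only a minimal sketch of the idea, assuming the score is simply the population variance of a keyword's recent daily counts. The class name `HotScoreSketch` and the sample counts are hypothetical.

```java
import java.util.Arrays;

// Hypothetical helper, not the original job's code: scores a keyword by the
// population variance of its recent daily search counts (an assumption).
final class HotScoreSketch {

    // Population variance: mean of the squared deviations from the mean.
    static double variance(long[] dailyCounts) {
        if (dailyCounts.length == 0) {
            return 0.0;
        }
        double mean = Arrays.stream(dailyCounts).average().orElse(0.0);
        double sumOfSquares = 0.0;
        for (long count : dailyCounts) {
            double deviation = count - mean;
            sumOfSquares += deviation * deviation;
        }
        return sumOfSquares / dailyCounts.length;
    }

    public static void main(String[] args) {
        long[] steady = {100, 100, 100, 100};   // searched evenly every day
        long[] spiking = {10, 10, 10, 370};     // sudden burst on the last day
        System.out.println(variance(steady));   // prints 0.0
        System.out.println(variance(spiking));  // prints 24300.0
    }
}
```

A keyword searched at a constant rate scores 0, while one that suddenly bursts scores high. Plain variance cannot tell a rise from a fall, so a real job would probably need an extra check on the direction of the change.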
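Now add the manual intervention. Ranking is a top-N problem to begin with, and the intervention only touches the first few slots. Editors always prefer the simple, direct, brute-force approach: assign a keyword straight to a position. That turns the job into a mixed sort on position (priority) and score (score). priority can effectively be treated as the ranking precedence, so the composite sort strategy is priority descending, then score descending.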
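Outside Hadoop, the same ordering rule can be sketched in plain Java. The `Entry` class, the sample keywords, and the convention that priority 0 means "not pinned" are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class RankSketch {

    // Hypothetical row of the hot-word list.
    static final class Entry {
        final String keyword;
        final int priority;  // editor-assigned; assumed 0 when not pinned
        final float score;   // computed hotness

        Entry(String keyword, int priority, float score) {
            this.keyword = keyword;
            this.priority = priority;
            this.score = score;
        }
    }

    // Ascending by priority, then by score; reversed() flips the whole
    // composite order, giving priority descending, then score descending.
    static final Comparator<Entry> PRIORITY_THEN_SCORE_DESC =
            Comparator.comparingInt((Entry e) -> e.priority)
                    .thenComparingDouble(e -> e.score)
                    .reversed();

    // Returns the keywords in ranked order without mutating the input.
    static List<String> rank(List<Entry> entries) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(PRIORITY_THEN_SCORE_DESC);
        List<String> keywords = new ArrayList<>();
        for (Entry e : sorted) {
            keywords.add(e.keyword);
        }
        return keywords;
    }

    public static void main(String[] args) {
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry("a", 0, 9.5f));
        entries.add(new Entry("b", 2, 1.0f));
        entries.add(new Entry("c", 1, 3.0f));
        entries.add(new Entry("d", 0, 12.0f));
        System.out.println(rank(entries)); // prints [b, c, d, a]
    }
}
```

The pinned keywords `b` and `c` come first regardless of their low scores, and the unpinned ones fall back to score order.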
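Under the map/reduce framework, sorting takes no real skill: you just call a method to tell the job the type of the key to sort on. Multi-field sorting, however, needs a custom Writable type that implements the WritableComparable interface to serve as the sort key, which is also simple. There isn't much Chinese-language material on hadoop online. I like to show off but I'm short on real hadoop programming substance, so forgive me if this gives you a chuckle.
Enough talk, here's the code.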
1. KeyWritable.java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.DataInputBuffer;
import org.apache.hadoop.io.FloatWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;

/** Composite sort key: editor-assigned priority first, then computed score. */
public class KeyWritable implements WritableComparable<KeyWritable> {

    private IntWritable priority;
    private FloatWritable score;

    // Hadoop instantiates keys through the no-arg constructor, then calls readFields().
    public KeyWritable() {
        priority = new IntWritable(0);
        score = new FloatWritable(0);
    }

    public KeyWritable(IntWritable priority, FloatWritable score) {
        set(priority, score);
    }

    public KeyWritable(int priority, float score) {
        set(new IntWritable(priority), new FloatWritable(score));
    }

    public void set(IntWritable priority, FloatWritable score) {
        this.priority = priority;
        this.score = score;
    }

    public IntWritable getPriority() {
        return this.priority;
    }

    public FloatWritable getScore() {
        return this.score;
    }

    // Fields must be read in the same order that write() emits them.
    @Override
    public void readFields(DataInput in) throws IOException {
        this.priority.readFields(in);
        this.score.readFields(in);
    }

    @Override
    public void write(DataOutput out) throws IOException {
        this.priority.write(out);
        this.score.write(out);
    }

    // Ascending order: priority first, score as the tie-breaker.
    @Override
    public int compareTo(KeyWritable obj) {
        int cmp = this.priority.compareTo(obj.priority);
        if (cmp != 0) {
            return cmp;
        }
        return this.score.compareTo(obj.score);
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof KeyWritable && this.compareTo((KeyWritable) obj) == 0;
    }

    // Uses both fields so that the hash agrees with equals().
    @Override
    public int hashCode() {
        return 31 * priority.hashCode() + score.hashCode();
    }

    @Override
    public String toString() {
        return priority + "\t" + score;
    }

    /**
     * Comparator
     * @author zhangmiao
     */
    public static class Comparator extends WritableComparator {

        public Comparator() {
            super(KeyWritable.class);
        }

        // Deserializes both keys from their raw bytes, then compares them as objects.
        @Override
        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
            KeyWritable key1 = new KeyWritable();
            KeyWritable key2 = new KeyWritable();
            DataInputBuffer buffer = new DataInputBuffer();
            try {
                buffer.reset(b1, s1, l1);
                key1.readFields(buffer);
                buffer.reset(b2, s2, l2);
                key2.readFields(buffer);
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
            return compare(key1, key2);
        }

        @Override
        @SuppressWarnings("rawtypes")
        public int compare(WritableComparable a, WritableComparable b) {
            if (a instanceof KeyWritable && b instanceof KeyWritable) {
                return ((KeyWritable) a).compareTo((KeyWritable) b);
            }
            return super.compare(a, b);
        }
    }

    /** Negates the ascending comparison so keys sort in descending order. */
    public static class DecreasingComparator extends Comparator {

        @Override
        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
            return -super.compare(b1, s1, l1, b2, s2, l2);
        }
    }
}
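To see what the descending comparator does to serialized keys without a Hadoop cluster, here is a small standalone sketch using only `java.io`. It assumes the key is laid out as a 4-byte int followed by a 4-byte float, which is what writing an IntWritable and then a FloatWritable produces. The class `RawKeySketch` and its methods are hypothetical stand-ins, not part of Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class RawKeySketch {

    // Serializes priority then score, mirroring the field order of write().
    static byte[] serialize(int priority, float score) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream(8);
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(priority);
            out.writeFloat(score);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads both keys back from their byte ranges and compares them
    // ascending: priority first, score as the tie-breaker.
    static int compareAscending(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        try {
            DataInputStream in1 = new DataInputStream(new ByteArrayInputStream(b1, s1, l1));
            DataInputStream in2 = new DataInputStream(new ByteArrayInputStream(b2, s2, l2));
            int cmp = Integer.compare(in1.readInt(), in2.readInt());
            if (cmp != 0) {
                return cmp;
            }
            return Float.compare(in1.readFloat(), in2.readFloat());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Negating the ascending result yields the descending order.
    static int compareDescending(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        return -compareAscending(b1, s1, l1, b2, s2, l2);
    }

    public static void main(String[] args) {
        byte[] pinned = serialize(1, 5.0f);
        byte[] hot = serialize(0, 99.0f);
        // Negative result: the pinned key sorts ahead of the hotter, unpinned one.
        System.out.println(compareDescending(pinned, 0, pinned.length, hot, 0, hot.length) < 0); // prints true
    }
}
```

Negation is safe here because `Integer.compare` and `Float.compare` only return -1, 0, or 1, so there is no overflow case to worry about.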
2. Set the KeyWritable comparator when submitting the job
job.setOutputKeyComparatorClass(KeyWritable.DecreasingComparator.class);
(To be continued)

