
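I. Which .NET distributed caching framework is best?

Our company also uses 力軟. It ships with workflow support, is fairly complete, and comes with technical support.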

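II. Distributed machine learning interview questions

Distributed machine learning is a widely discussed topic, with both great potential and real challenges in theory and in practice. This article walks through common interview questions on the subject to help readers understand the field.

What is distributed machine learning?

Distributed machine learning uses multiple machines for data processing and model training. Unlike traditional single-machine (centralized) training, it can handle large-scale data and high-dimensional models that do not fit on one machine, which improves training efficiency and can improve model performance.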
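To make the idea concrete, here is a minimal sketch of synchronous data-parallel training, simulated inside a single JVM. It is only an illustration: the class and method names (`DataParallelSketch`, `shardGradient`, `train`) are hypothetical, a parallel stream stands in for separate worker machines, and the model is a one-parameter fit of y = w * x.

```java
import java.util.ArrayList;
import java.util.List;

class DataParallelSketch {
    // Gradient of the mean squared error of y = w * x over one data shard.
    // Each row of the shard is {x, y}.
    static double shardGradient(double w, double[][] shard) {
        double sum = 0.0;
        for (double[] xy : shard) {
            double error = w * xy[0] - xy[1];
            sum += 2.0 * error * xy[0];
        }
        return sum / shard.length;
    }

    // Every step: each "worker" computes a gradient on its own shard,
    // the gradients are averaged, and the shared weight is updated once.
    static double train(List<double[][]> shards, int steps, double learningRate) {
        double w = 0.0;
        for (int s = 0; s < steps; s++) {
            final double current = w;
            double averageGradient = shards.parallelStream()
                    .mapToDouble(shard -> shardGradient(current, shard))
                    .average()
                    .orElse(0.0);
            w -= learningRate * averageGradient;
        }
        return w;
    }

    public static void main(String[] args) {
        List<double[][]> shards = new ArrayList<>();
        shards.add(new double[][] {{1, 3}, {2, 6}});   // shard held by worker 1
        shards.add(new double[][] {{3, 9}, {4, 12}});  // shard held by worker 2
        System.out.println("learned w = " + train(shards, 200, 0.01));
    }
}
```

In a real cluster the averaging step is where the network cost appears, which is why the interview questions further down ask about parameter servers and communication patterns.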

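What are the common distributed machine learning frameworks?

Frameworks widely used in practice include TensorFlow, PyTorch, and Apache Spark. They provide rich APIs and tooling that make it easier to build and deploy distributed machine learning models.

Sample distributed machine learning interview questions

Below are some common interview questions for reference:

What is MapReduce? How is MapReduce used in distributed machine learning?
What is a parameter server (Parameter Server)? What role does it play in distributed machine learning?
How would you design an efficient distributed machine learning algorithm?
What are the common approaches to data synchronization and communication in distributed machine learning?
Why does load balancing matter in distributed machine learning?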

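How to prepare for a distributed machine learning interview

Candidates can prepare along the following lines:

Understand the principles in depth: learn the basic principles and common algorithms of distributed machine learning, including MapReduce and Parameter Server.
Master the common frameworks: be familiar with the basic usage and characteristics of TensorFlow, PyTorch, Apache Spark, and similar frameworks.
Solve real problems: practice on real projects to learn how distributed machine learning is applied in practice.
Take mock interviews: use them to find your weak spots in this area and improve them deliberately.

Summary

Distributed machine learning matters both in academic research and in industrial practice. With a solid understanding of its principles and frameworks, plus hands-on practice and mock interviews, readers should be well placed to do well in interviews. We hope this article helps, and good luck with your distributed machine learning interviews.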

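III. How does Redis cache data in a distributed way?

Redis uses a single-threaded I/O multiplexing model. It wraps its own simple event-handling framework, AeEvent, which mainly implements epoll, kqueue, and select. For purely I/O-bound work, a single thread gets the most out of this design. However, Redis also offers some simple computation features such as sorting and aggregation. For these operations the single-threaded model can seriously hurt overall throughput, because all I/O scheduling is blocked while the CPU computation runs.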

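IV. Can a Redis distributed lock prevent cache breakdown?

Yes, a Redis distributed lock can prevent cache breakdown. Cache breakdown happens when, under high concurrency, the cache entry for a hot key expires and a large number of requests go straight to the database, overloading it.

A distributed lock solves this by giving mutually exclusive access to the hot key, so that only one thread rebuilds the cache entry. Threads that fail to acquire the lock wait until it is released and then read the rebuilt entry. This keeps many threads from querying the database at the same time and reduces database load.

Note that a distributed lock has a performance cost of its own, because it involves network round trips and lock contention. Weigh performance against consistency requirements, and choose the lock granularity and usage pattern accordingly.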
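The flow described above can be sketched as follows. This is a hedged, in-process illustration rather than production code: a `ConcurrentHashMap` stands in for Redis, with `putIfAbsent` playing the role of `SET key value NX`, and all names (`CacheMutexSketch`, `get`, `concurrentLoads`) are hypothetical. A real Redis lock would also need an expiry time so that a crashed holder cannot block everyone forever, which this sketch omits.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class CacheMutexSketch {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    // Stand-in for Redis: putIfAbsent mimics "SET lockKey token NX".
    private final ConcurrentHashMap<String, String> locks = new ConcurrentHashMap<>();
    final AtomicInteger dbLoads = new AtomicInteger();

    String get(String key, Function<String, String> dbLoader) {
        while (true) {
            String cached = cache.get(key);
            if (cached != null) {
                return cached;
            }
            String lockKey = "lock:" + key;
            String token = Thread.currentThread().getName() + ":" + System.nanoTime();
            if (locks.putIfAbsent(lockKey, token) == null) {
                try {
                    // Re-check: another thread may have rebuilt the entry already.
                    cached = cache.get(key);
                    if (cached == null) {
                        dbLoads.incrementAndGet();
                        cached = dbLoader.apply(key);
                        cache.put(key, cached);
                    }
                    return cached;
                } finally {
                    // Release only if we still own the lock.
                    locks.remove(lockKey, token);
                }
            }
            try {
                Thread.sleep(5); // lock is held elsewhere: back off, then retry
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    // Fires `threads` concurrent reads at one cold key and
    // returns how many of them reached the "database".
    static int concurrentLoads(int threads) {
        CacheMutexSketch sketch = new CacheMutexSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> sketch.get("hot-item", k -> "value-from-db"));
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return sketch.dbLoads.get();
    }

    public static void main(String[] args) {
        System.out.println("database loads: " + concurrentLoads(16));
    }
}
```

The re-check inside the lock is what keeps late arrivals from loading the same row again after the first thread has already refilled the cache.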

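V. How does Memcached implement distributed caching?

Although memcached is called a "distributed" cache server, the server side has no "distributed" functionality. Each server is a fully independent, isolated service. The distribution in memcached is implemented entirely by the client library, and this is memcached's most distinctive feature.

Distribution principle

The word "distributed" has been used several times here without a detailed explanation, so here is a brief introduction to the principle. The various client implementations work in basically the same way. Assume there are three memcached servers, node1 through node3, and the application wants to store data under the keys "tokyo", "kanagawa", "chiba", "saitama", and "gunma".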
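The source stops before showing how the client picks a server, so the following is only a sketch under an assumption: the client hashes the key and takes the remainder modulo the number of servers. The CRC32 hash and the names `ClientSideSharding` and `pickNode` are illustrative choices, not taken from any particular memcached client.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.zip.CRC32;

class ClientSideSharding {
    // The client, not the server, decides where a key lives:
    // hash the key, then take the remainder by the number of servers.
    static String pickNode(String key, List<String> nodes) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        int index = (int) (crc.getValue() % nodes.size());
        return nodes.get(index);
    }

    public static void main(String[] args) {
        List<String> nodes = Arrays.asList("node1", "node2", "node3");
        for (String key : new String[] {"tokyo", "kanagawa", "chiba", "saitama", "gunma"}) {
            System.out.println(key + " -> " + pickNode(key, nodes));
        }
    }
}
```

Because the same key always hashes to the same server, reads find what writes stored. The weakness of plain modulo is that changing the number of servers remaps most keys, which is the problem consistent hashing is meant to reduce.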

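VI. Does the Hadoop distributed cache have to be on HDFS?

Yes. Hadoop has to process large datasets quickly, and it does so through the Hadoop Distributed File System (HDFS). HDFS essentially moves the computation to the data rather than transferring the data to the computation.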

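VII. Distributed e-commerce project interview question bank

With the rapid development of internet technology, e-commerce plays an increasingly important role in modern society. For engineers working on distributed e-commerce projects, knowing the relevant interview questions is essential. This post collects some common interview questions for distributed e-commerce projects to help readers prepare.

I. Overview of distributed systems

1. What is a distributed system?

A distributed system is a system made up of multiple computers connected by a network. The computers communicate and cooperate by passing messages and jointly provide a service.

2. What are the characteristics of a distributed system?

Distribution
Concurrency
Lack of a global clock
Independent failures
Scalability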

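II. Common interview questions for distributed e-commerce projects

1. What is e-commerce?

E-commerce uses electronic communication technology to carry out every stage of a transaction electronically, which adds value to business activities, enables new forms of management and personalized service, and fundamentally changes how trade is conducted.

2. What are the key technologies in a distributed e-commerce architecture?

Load balancing
Distributed caching
Distributed databases
Message queues
Distributed transactions

3. Why do distributed systems need to consider data consistency?

In a distributed system, replicating and synchronizing data between nodes can leave copies temporarily out of step, so mechanisms are needed to guarantee data consistency.

4. What is the CAP theorem in distributed systems?

The CAP theorem states that a distributed system cannot guarantee consistency (Consistency), availability (Availability), and partition tolerance (Partition Tolerance) all at once, so it has to trade them off. Because network partitions cannot be ruled out in practice, the real choice is usually between consistency and availability while a partition lasts.

5. What message middleware is commonly used in distributed systems?

Common message middleware includes Kafka, RabbitMQ, and ActiveMQ. They provide asynchronous sending and receiving of messages between systems, which decouples the systems and improves reliability.

6. What is a distributed transaction? How is its consistency guaranteed?

A distributed transaction is a group of operations that spans multiple nodes. To keep it consistent, you can use mechanisms such as the two-phase commit (Two-Phase Commit) protocol or compensating transactions (Compensating Transaction).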
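As an illustration of the two-phase commit idea mentioned above, here is a minimal in-memory sketch. It assumes a single coordinator and well-behaved participants, and it leaves out timeouts, logging, and crash recovery, which a real implementation needs. The names `TwoPhaseCommitSketch`, `Participant`, and `InMemoryParticipant` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class TwoPhaseCommitSketch {
    interface Participant {
        boolean prepare();  // phase 1: vote yes (true) or no (false)
        void commit();      // phase 2, if every participant voted yes
        void rollback();    // phase 2, if any participant voted no
    }

    // The coordinator: commit only when all participants are prepared.
    static boolean run(List<Participant> participants) {
        List<Participant> prepared = new ArrayList<>();
        for (Participant p : participants) {
            if (p.prepare()) {
                prepared.add(p);
            } else {
                // One "no" vote aborts the whole transaction.
                for (Participant q : prepared) {
                    q.rollback();
                }
                return false;
            }
        }
        for (Participant p : participants) {
            p.commit();
        }
        return true;
    }

    // Toy participant that records its state so the outcome can be inspected.
    static class InMemoryParticipant implements Participant {
        final boolean canCommit;
        String state = "INIT";

        InMemoryParticipant(boolean canCommit) {
            this.canCommit = canCommit;
        }

        public boolean prepare() {
            state = canCommit ? "PREPARED" : "ABORTED";
            return canCommit;
        }

        public void commit() {
            state = "COMMITTED";
        }

        public void rollback() {
            state = "ROLLED_BACK";
        }
    }

    public static void main(String[] args) {
        List<Participant> participants = new ArrayList<>();
        participants.add(new InMemoryParticipant(true));   // e.g. an order service
        participants.add(new InMemoryParticipant(false));  // e.g. an inventory service that refuses
        System.out.println("committed: " + run(participants));
    }
}
```

A compensating-transaction design takes the opposite approach: each step commits locally right away, and a failure later triggers explicit undo operations for the steps that already succeeded.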

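7. How do you keep data secure in a distributed system?

Data security in a distributed system can be protected through data encryption, access control, and audit logging, together with regular vulnerability scanning and patching.

III. Summary

Studying this question bank helps engineers preparing for interviews understand the concepts and technologies of distributed systems, and gives them a solid foundation for future work on distributed e-commerce projects. We hope this post is helpful, and good luck with your interviews.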

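VIII. What are the solutions for distributed cache consistency?

Solutions for distributed cache consistency include:

1. Cache update strategy: fix a defined order for cache and database writes. The approach given here is to update the cache first and then the database. A widely used alternative is to update the database first and then invalidate the cache entry, because writing the cache first leaves stale data behind if the database write fails.

2. Distributed locks: use a distributed lock so that only one node can modify a cache entry at a time, which avoids inconsistency caused by several nodes modifying it concurrently.

3. Cache expiration strategy: use an expiration mechanism to keep cached data fresh and to avoid serving data that has already expired.

4. Data synchronization strategy: use a synchronization mechanism to keep multiple cache nodes consistent, for example broadcast or publish/subscribe.

5. Consistent hashing: use a consistent hashing algorithm to spread cached data across nodes, so that adding or removing a node remaps only a small share of the keys and load stays reasonably balanced.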
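Item 5 can be illustrated with a small hash ring. This is a sketch under assumptions: CRC32 is used as the hash only because it is in the standard library, each node is placed on the ring at several virtual points, and the names `ConsistentHashRing`, `addNode`, `removeNode`, and `nodeFor` are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

class ConsistentHashRing {
    // Ring position -> node name, kept sorted so we can walk clockwise.
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    ConsistentHashRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    private static long hash(String value) {
        CRC32 crc = new CRC32();
        crc.update(value.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    void addNode(String node) {
        // Several virtual points per node smooth out the key distribution.
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    void removeNode(String node) {
        ring.values().removeIf(node::equals);
    }

    // A key belongs to the first node at or after its position, wrapping around.
    String nodeFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no nodes on the ring");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        ConsistentHashRing ring = new ConsistentHashRing(50);
        ring.addNode("node1");
        ring.addNode("node2");
        ring.addNode("node3");
        String[] keys = {"tokyo", "kanagawa", "chiba", "saitama", "gunma"};
        for (String key : keys) {
            System.out.println(key + " -> " + ring.nodeFor(key));
        }
        ring.removeNode("node3");
        for (String key : keys) {
            System.out.println("after removal: " + key + " -> " + ring.nodeFor(key));
        }
    }
}
```

Only keys that lived on the removed node move to a new owner. Keys on the surviving nodes keep their placement, so most of the cache stays warm.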

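IX. Differences between distributed and non-distributed systems

The main differences between distributed and non-distributed systems are:

1. Resource sharing: in a non-distributed system, all resources sit in one place and are maintained by a single device or host. In a distributed system, resources are spread across different devices or servers and shared over the network.

2. Reliability: distributed systems are more fault-tolerant and reliable than non-distributed ones. Backups and redundancy are standard practice, so even if one device or server fails, the others keep the system available.

3. Performance: in a non-distributed system, a single device or host handles every task, so its performance becomes the bottleneck for the whole system. In a distributed system, tasks can run in parallel and load balancing spreads them across devices or servers, which greatly increases processing capacity.

4. Security: in a distributed system, data is spread across multiple devices or servers and travels over the network, so stricter security measures are needed to protect it against leaks and attacks.

5. Maintenance cost: distributed systems cost more to maintain and manage than non-distributed ones. There are more devices or servers to manage and the network topology is more complex, so specialist staff are needed.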

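X. Mahout interview question?

I previously read through how Mahout's official 20news example is invoked, and wanted to implement other examples following the same flow. Online I found an example about whether the weather is suitable for playing badminton.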

Training data:

Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No

Test data:

sunny, hot, high, weak

Result:

Yes => 0.007039

No => 0.027418
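The two scores above can be checked by hand with plain naive Bayes, multiplying the class prior by the per-feature relative frequencies from the training table. The sketch below does exactly that with no smoothing. The class name `PlayTennisNaiveBayes` is hypothetical, and this is a hand calculation rather than what Mahout runs internally, so Mahout's TF-IDF based classifier is not expected to print the same numbers. With exact fractions the scores come out at about 0.00705 for Yes and 0.02743 for No, close to the figures quoted above; the small gap is presumably due to rounding of intermediate probabilities in the source.

```java
class PlayTennisNaiveBayes {
    // Columns: Outlook, Temperature, Humidity, Wind, PlayTennis (rows D1..D14).
    static final String[][] DATA = {
        {"Sunny", "Hot", "High", "Weak", "No"},
        {"Sunny", "Hot", "High", "Strong", "No"},
        {"Overcast", "Hot", "High", "Weak", "Yes"},
        {"Rain", "Mild", "High", "Weak", "Yes"},
        {"Rain", "Cool", "Normal", "Weak", "Yes"},
        {"Rain", "Cool", "Normal", "Strong", "No"},
        {"Overcast", "Cool", "Normal", "Strong", "Yes"},
        {"Sunny", "Mild", "High", "Weak", "No"},
        {"Sunny", "Cool", "Normal", "Weak", "Yes"},
        {"Rain", "Mild", "Normal", "Weak", "Yes"},
        {"Sunny", "Mild", "Normal", "Strong", "Yes"},
        {"Overcast", "Mild", "High", "Strong", "Yes"},
        {"Overcast", "Hot", "Normal", "Weak", "Yes"},
        {"Rain", "Mild", "High", "Strong", "No"},
    };

    // P(label) * product of P(feature_i | label), using raw relative frequencies.
    static double score(String[] features, String label) {
        int labelCount = 0;
        int[] matches = new int[features.length];
        for (String[] row : DATA) {
            if (!row[4].equalsIgnoreCase(label)) {
                continue;
            }
            labelCount++;
            for (int i = 0; i < features.length; i++) {
                if (row[i].equalsIgnoreCase(features[i])) {
                    matches[i]++;
                }
            }
        }
        if (labelCount == 0) {
            return 0.0; // label never seen in the training data
        }
        double score = (double) labelCount / DATA.length;
        for (int m : matches) {
            score *= (double) m / labelCount;
        }
        return score;
    }

    public static void main(String[] args) {
        String[] sample = {"sunny", "hot", "high", "weak"};
        System.out.printf("Yes => %.6f%n", score(sample, "Yes"));
        System.out.printf("No  => %.6f%n", score(sample, "No"));
    }
}
```

Since the No score is the larger of the two, the sample is classified as not suitable for playing.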

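So I used Java code to call Mahout's utility classes to implement the classification.

Basic idea:

1. Construct the classification data.

2. Train with Mahout's utility classes to obtain a trained model.

3. Convert the data to be tested into vector data.

4. Have the classifier classify the vector data.

Here is my implementation:

1. Construct the classification data:

On HDFS, create the directory /zhoujianfeng/playtennis/input and upload the data from the category folders no and yes to HDFS.

Data file format, for example the content of file D1: Sunny Hot High Weak

2. Train with Mahout's utility classes to obtain a trained model.

3. Convert the data to be tested into vector data.

4. Have the classifier classify the vector data.

For these three steps I will paste all the code at once. It consists mainly of two classes, PlayTennis1 and BayesCheckData: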

// PlayTennis1.java
package myTesting.bayes;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;
import org.apache.mahout.classifier.naivebayes.training.TrainNaiveBayesJob;
import org.apache.mahout.text.SequenceFilesFromDirectory;
import org.apache.mahout.vectorizer.SparseVectorsFromSequenceFiles;

public class PlayTennis1 {

    private static final String WORK_DIR = "hdfs://192.168.9.72:9000/zhoujianfeng/playtennis";

    /** Test entry point. */
    public static void main(String[] args) {
        // Convert the training data into vectors
        makeTrainVector();
        // Build the training model
        makeModel(false);
        // Classify the test data
        BayesCheckData.printResult();
    }

    /** Converts the test data into sequence files, then into TF-IDF vectors. */
    public static void makeCheckVector() {
        toSequenceFiles(dir("testinput"), dir("tennis-test-seq"));
        toVectors(dir("tennis-test-seq"), dir("tennis-test-vectors"));
    }

    /** Converts the training data into sequence files, then into TF-IDF vectors. */
    public static void makeTrainVector() {
        toSequenceFiles(dir("input"), dir("tennis-seq"));
        toVectors(dir("tennis-seq"), dir("tennis-vectors"));
    }

    /** Trains the naive Bayes model; pass true for the complementary variant (-c). */
    public static void makeModel(boolean completelyNB) {
        String input = dir("tennis-vectors") + Path.SEPARATOR + "tfidf-vectors";
        String model = dir("model");
        String labelindex = dir("labelindex");
        String[] params;
        if (completelyNB) {
            params = new String[] {"-i", input, "-el", "-o", model, "-li", labelindex, "-ow", "-c"};
        } else {
            params = new String[] {"-i", input, "-el", "-o", model, "-li", labelindex, "-ow"};
        }
        runStep(new TrainNaiveBayesJob(), params, input,
                "Failed to build the training model!", 3, model, labelindex);
    }

    private static String dir(String name) {
        return WORK_DIR + Path.SEPARATOR + name;
    }

    private static void toSequenceFiles(String input, String output) {
        String[] params = new String[] {"-i", input, "-o", output, "-ow"};
        runStep(new SequenceFilesFromDirectory(), params, input,
                "Failed to convert the files to sequence files!", 1, output);
    }

    private static void toVectors(String input, String output) {
        String[] params = new String[] {"-i", input, "-o", output, "-lnorm", "-nv", "-wt", "tfidf"};
        // The original printed 2 here instead of exiting; exit with code 2 like the other steps.
        runStep(new SparseVectorsFromSequenceFiles(), params, input,
                "Failed to convert the sequence files to vectors!", 2, output);
    }

    /** Deletes stale outputs, then runs the Mahout tool if the input path exists. */
    private static void runStep(Tool tool, String[] params, String input,
                                String failureMessage, int exitCode, String... outputs) {
        try {
            Configuration conf = new Configuration();
            conf.addResource(new Path("/usr/local/hadoop/conf/core-site.xml"));
            FileSystem fs = FileSystem.get(conf);
            if (fs.exists(new Path(input))) {
                for (String output : outputs) {
                    Path out = new Path(output);
                    if (fs.exists(out)) {
                        // The boolean argument means "delete recursively"
                        fs.delete(out, true);
                    }
                }
                ToolRunner.run(tool, params);
            }
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println(failureMessage);
            System.exit(exitCode);
        }
    }
}

// BayesCheckData.java
package myTesting.bayes;

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.mahout.classifier.naivebayes.BayesUtils;
import org.apache.mahout.classifier.naivebayes.NaiveBayesModel;
import org.apache.mahout.classifier.naivebayes.StandardNaiveBayesClassifier;
import org.apache.mahout.common.Pair;
import org.apache.mahout.common.iterator.sequencefile.PathType;
import org.apache.mahout.common.iterator.sequencefile.SequenceFileDirIterable;
import org.apache.mahout.math.RandomAccessSparseVector;
import org.apache.mahout.math.Vector;
import org.apache.mahout.math.Vector.Element;
import org.apache.mahout.vectorizer.TFIDF;

import com.google.common.collect.ConcurrentHashMultiset;
import com.google.common.collect.Multiset;

public class BayesCheckData {

    private static StandardNaiveBayesClassifier classifier;
    private static Map<String, Integer> dictionary;
    private static Map<Integer, Long> documentFrequency;
    private static Map<Integer, String> labelIndex;

    /** Loads the dictionary, document frequencies, label index, and trained model. */
    public void init(Configuration conf) {
        try {
            String modelPath = "/zhoujianfeng/playtennis/model";
            String dictionaryPath = "/zhoujianfeng/playtennis/tennis-vectors/dictionary.file-0";
            String documentFrequencyPath = "/zhoujianfeng/playtennis/tennis-vectors/df-count";
            String labelIndexPath = "/zhoujianfeng/playtennis/labelindex";
            dictionary = readDictionnary(conf, new Path(dictionaryPath));
            documentFrequency = readDocumentFrequency(conf, new Path(documentFrequencyPath));
            labelIndex = BayesUtils.readLabelIndex(conf, new Path(labelIndexPath));
            NaiveBayesModel model = NaiveBayesModel.materialize(new Path(modelPath), conf);
            classifier = new StandardNaiveBayesClassifier(model);
        } catch (IOException e) {
            e.printStackTrace();
            System.out.println("Error while initializing the data needed to vectorize the test sample");
            System.exit(4);
        }
    }

    /**
     * Loads the dictionary file. Key: term value; value: term ID.
     *
     * @param conf Hadoop configuration
     * @param dictionnaryDir path of the dictionary file
     * @return map from term to term ID
     */
    private static Map<String, Integer> readDictionnary(Configuration conf, Path dictionnaryDir) {
        Map<String, Integer> dictionnary = new HashMap<String, Integer>();
        PathFilter filter = new PathFilter() {
            @Override
            public boolean accept(Path path) {
                String name = path.getName();
                return name.startsWith("dictionary.file");
            }
        };
        for (Pair<Text, IntWritable> pair : new SequenceFileDirIterable<Text, IntWritable>(dictionnaryDir, PathType.LIST, filter, conf)) {
            dictionnary.put(pair.getFirst().toString(), pair.getSecond().get());
        }
        return dictionnary;
    }

    /**
     * Loads the term document-frequency files under df-count. Key: term ID; value: document frequency.
     *
     * @param conf Hadoop configuration
     * @param documentFrequencyDir path of the df-count directory
     * @return map from term ID to document frequency
     */
    private static Map<Integer, Long> readDocumentFrequency(Configuration conf, Path documentFrequencyDir) {
        Map<Integer, Long> documentFrequency = new HashMap<Integer, Long>();
        PathFilter filter = new PathFilter() {
            @Override
            public boolean accept(Path path) {
                return path.getName().startsWith("part-r");
            }
        };
        for (Pair<IntWritable, LongWritable> pair : new SequenceFileDirIterable<IntWritable, LongWritable>(documentFrequencyDir, PathType.LIST, filter, conf)) {
            documentFrequency.put(pair.getFirst().get(), pair.getSecond().get());
        }
        return documentFrequency;
    }

    /** Vectorizes the sample "sunny, hot, high, weak" and returns the best-scoring label. */
    public static String getCheckResult() {
        Configuration conf = new Configuration();
        conf.addResource(new Path("/usr/local/hadoop/conf/core-site.xml"));
        String classify = "NaN";
        BayesCheckData cdv = new BayesCheckData();
        cdv.init(conf);
        System.out.println("init done...............");
        Vector vector = new RandomAccessSparseVector(10000);
        TFIDF tfidf = new TFIDF();
        // sunny, hot, high, weak
        Multiset<String> words = ConcurrentHashMultiset.create();
        words.add("sunny", 1);
        words.add("hot", 1);
        words.add("high", 1);
        words.add("weak", 1);
        int documentCount = documentFrequency.get(-1).intValue(); // key -1 holds the total number of documents
        for (Multiset.Entry<String> entry : words.entrySet()) {
            String word = entry.getElement();
            int count = entry.getCount();
            Integer wordId = dictionary.get(word); // the word ID comes from dictionary.file-0
            // Skip words missing from the dictionary (the original called toString() on a possibly null ID)
            if (wordId == null) {
                continue;
            }
            Long freq = documentFrequency.get(wordId);
            if (freq == null) {
                continue;
            }
            double tfIdfValue = tfidf.calculate(count, freq.intValue(), 1, documentCount);
            vector.setQuick(wordId, tfIdfValue);
        }
        // Classify with naive Bayes and pick the label with the best score
        Vector resultVector = classifier.classifyFull(vector);
        double bestScore = -Double.MAX_VALUE;
        int bestCategoryId = -1;
        for (Element element : resultVector.all()) {
            int categoryId = element.index();
            double score = element.get();
            System.out.println("categoryId:" + categoryId + " score:" + score);
            if (score > bestScore) {
                bestScore = score;
                bestCategoryId = categoryId;
            }
        }
        classify = labelIndex.get(bestCategoryId) + "(categoryId=" + bestCategoryId + ")";
        return classify;
    }

    public static void printResult() {
        System.out.println("The detected category is: " + getCheckResult());
    }
}
