{"pages":[{"title":"About","date":"2020-01-16T03:29:07.780Z","path":"about/index.html","text":"Data statistics, analysis, and mining. GitHub Telegram"},{"title":"Categories","date":"2020-01-16T02:23:20.351Z","path":"categories/index.html","text":""},{"title":"Tags","date":"2020-01-16T02:23:20.352Z","path":"tags/index.html","text":""}],"posts":[{"title":"Choosing a Correlation Coefficient","date":"2020-02-12T16:00:00.000Z","path":"wiki/相关系数/","text":"Question: What is the relationship between the magnitude of a correlation coefficient and its significance level? How should it be interpreted? Answer: Two continuous variables: Pearson correlation; two ordinal variables: Spearman correlation; one continuous variable and one dichotomous variable: point-biserial correlation; multiple sets of rank ratings: Kendall's coefficient of concordance. @Bob","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"相关","slug":"相关","permalink":"http://yoursite.com/tags/%E7%9B%B8%E5%85%B3/"},{"name":"系数","slug":"系数","permalink":"http://yoursite.com/tags/%E7%B3%BB%E6%95%B0/"},{"name":"变量类型","slug":"变量类型","permalink":"http://yoursite.com/tags/%E5%8F%98%E9%87%8F%E7%B1%BB%E5%9E%8B/"}],"categories":[{"name":"理论","slug":"理论","permalink":"http://yoursite.com/categories/%E7%90%86%E8%AE%BA/"}]},{"title":"Linear Regression Assumptions","date":"2020-01-29T16:00:00.000Z","path":"wiki/线性回归假设/","text":"Question: What assumptions does linear regression make? Answer: The classical linear model requires 7 assumptions (from a book by Bickle at UCB; I can't believe I've forgotten the title…). When an assumption fails, fixing them one at a time gradually leads to the modern linear model. @ld2012 The independent variables must be known; additivity and linearity. Just a layman's, beginner's take 🙈 @colorfi","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"线性回归","slug":"线性回归","permalink":"http://yoursite.com/tags/%E7%BA%BF%E6%80%A7%E5%9B%9E%E5%BD%92/"},{"name":"假设","slug":"假设","permalink":"http://yoursite.com/tags/%E5%81%87%E8%AE%BE/"}],"categories":[{"name":"理论","slug":"理论","permalink":"http://yoursite.com/categories/%E7%90%86%E8%AE%BA/"}]},{"title":"Regularization","date":"2020-01-28T16:00:00.000Z","path":"wiki/正则化/","text":"Question: What is regularization? 
Answer: Put simply, regularization adds a constraint (penalty) term when estimating $\\hat{\\beta}$, which yields a new estimate $\\hat{\\beta}'$. It can then be shown mathematically that $\\|\\hat{\\beta}'\\| < \\|\\hat{\\beta}\\|$ (the answer points to measure theory for this), so once the regularization term is added, the new estimate $\\hat{\\beta}'$ has a smaller norm than the original estimate $\\hat{\\beta}$, which helps avoid overfitting. Reference link @mbpRetina","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"正则化","slug":"正则化","permalink":"http://yoursite.com/tags/%E6%AD%A3%E5%88%99%E5%8C%96/"},{"name":"机器学习","slug":"机器学习","permalink":"http://yoursite.com/tags/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0/"}],"categories":[{"name":"理论","slug":"理论","permalink":"http://yoursite.com/categories/%E7%90%86%E8%AE%BA/"}]},{"title":"Estimating Infectious Disease Cases","date":"2020-01-22T16:00:00.000Z","path":"wiki/传染病估算/","text":"Question: How can the number of people infected with the Wuhan pneumonia nationwide be estimated? Answer: I think the three major telecom carriers could count this quite easily; Alipay data might still be biased. @m1kufan People whose phone numbers are located at hospitals but who are not regularly there? @wweiyan","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"估算","slug":"估算","permalink":"http://yoursite.com/tags/%E4%BC%B0%E7%AE%97/"},{"name":"武汉肺炎","slug":"武汉肺炎","permalink":"http://yoursite.com/tags/%E6%AD%A6%E6%B1%89%E8%82%BA%E7%82%8E/"},{"name":"传染病","slug":"传染病","permalink":"http://yoursite.com/tags/%E4%BC%A0%E6%9F%93%E7%97%85/"}],"categories":[{"name":"应用","slug":"应用","permalink":"http://yoursite.com/categories/%E5%BA%94%E7%94%A8/"}]},{"title":"Making Word Clouds in Python","date":"2020-01-19T16:00:00.000Z","path":"wiki/Python 词云制作/","text":"Question: What simple, easy-to-use methods or tools are there for making word clouds? 
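Answer: wordcloud2; Tableau or FineBI also work. @wweiyan Mathematica works too. @clover0722","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"Python","slug":"Python","permalink":"http://yoursite.com/tags/Python/"},{"name":"词云","slug":"词云","permalink":"http://yoursite.com/tags/%E8%AF%8D%E4%BA%91/"}],"categories":[{"name":"应用","slug":"应用","permalink":"http://yoursite.com/categories/%E5%BA%94%E7%94%A8/"}]},{"title":"Multicollinearity in Linear Regression","date":"2020-01-17T16:00:00.000Z","path":"wiki/线性回归多重共线性/","text":"Question: In multiple linear regression, how do we detect and handle multicollinearity among the independent variables? Answer: [Definition] Multicollinearity arises when the independent variables are linearly related or highly correlated. Both forms of multicollinearity make the regression parameters hard to estimate. Perfect multicollinearity means the matrix X'X is strictly non-invertible, so the model cannot be estimated; approximate multicollinearity means one or more column vectors of the data matrix can be approximately expressed as linear combinations of the other columns, which inflates the standard errors of the parameter estimates. [Detection] A rule of thumb for judging whether approximate multicollinearity is severe: (1) the largest variance inflation factor (VIF) among the independent variables exceeds 10; (2) the mean VIF is substantially greater than 1. [Remedies] When multicollinearity occurs, it has to be dealt with to keep the model valid. With perfect multicollinearity, simply delete the redundant variables from the data. These may be the reference category of a set of dummy variables, or new variables generated from other variables or their linear combinations. It is enough to ensure that no perfect multicollinearity remains after the deletion. With approximate multicollinearity there is no particularly simple fix. If the independent variables can be identified theoretically, that is, each is theoretically meaningful without overlapping in meaning, or none can be linearly explained by the others, then when approximate multicollinearity shows up in practice we can address it by increasing the sample size. But when there is no clear theory and the independent variables cannot be identified theoretically, technical methods can be used to reduce the number of independent variables. A typical approach is to combine mutually correlated variables into a smaller number of composite variables. Methods for combining variable information this way include partial least squares regression, principal component analysis, and factor analysis, which generalizes principal component analysis. [Reference] Xie Yu (谢宇), 《回归分析》 (Regression Analysis) 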
@reynd","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"线性回归","slug":"线性回归","permalink":"http://yoursite.com/tags/%E7%BA%BF%E6%80%A7%E5%9B%9E%E5%BD%92/"},{"name":"多重共线性","slug":"多重共线性","permalink":"http://yoursite.com/tags/%E5%A4%9A%E9%87%8D%E5%85%B1%E7%BA%BF%E6%80%A7/"},{"name":"多元线性回归","slug":"多元线性回归","permalink":"http://yoursite.com/tags/%E5%A4%9A%E5%85%83%E7%BA%BF%E6%80%A7%E5%9B%9E%E5%BD%92/"},{"name":"自变量","slug":"自变量","permalink":"http://yoursite.com/tags/%E8%87%AA%E5%8F%98%E9%87%8F/"}],"categories":[{"name":"理论","slug":"理论","permalink":"http://yoursite.com/categories/%E7%90%86%E8%AE%BA/"},{"name":"应用","slug":"理论/应用","permalink":"http://yoursite.com/categories/%E7%90%86%E8%AE%BA/%E5%BA%94%E7%94%A8/"}]},{"title":"Generalization Ability of Random Forests","date":"2020-01-15T16:00:00.000Z","path":"wiki/随机森林泛化能力/","text":"Question: Why are random forests said to generalize better than decision trees? Answer: A random forest uses a voting mechanism, which reduces the bias that any single tree may show; a decision tree makes a single decision on its own and overfits easily. @wweiyan","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"机器学习","slug":"机器学习","permalink":"http://yoursite.com/tags/%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0/"},{"name":"随机森林","slug":"随机森林","permalink":"http://yoursite.com/tags/%E9%9A%8F%E6%9C%BA%E6%A3%AE%E6%9E%97/"},{"name":"决策树","slug":"决策树","permalink":"http://yoursite.com/tags/%E5%86%B3%E7%AD%96%E6%A0%91/"},{"name":"泛化能力","slug":"泛化能力","permalink":"http://yoursite.com/tags/%E6%B3%9B%E5%8C%96%E8%83%BD%E5%8A%9B/"}],"categories":[{"name":"理论","slug":"理论","permalink":"http://yoursite.com/categories/%E7%90%86%E8%AE%BA/"},{"name":"应用","slug":"理论/应用","permalink":"http://yoursite.com/categories/%E7%90%86%E8%AE%BA/%E5%BA%94%E7%94%A8/"}]},{"title":"Selecting and Slicing","date":"2019-09-02T16:00:00.000Z","path":"wiki/选择与切片/","text":"Question: In different data analysis software and languages, what are your tips for selecting or slicing? Feel free to share~ Answer: None yet. Head to the Telegram group 
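to contribute an answer. Without small steps, no one reaches a thousand miles.","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"语言","slug":"语言","permalink":"http://yoursite.com/tags/%E8%AF%AD%E8%A8%80/"},{"name":"选择","slug":"选择","permalink":"http://yoursite.com/tags/%E9%80%89%E6%8B%A9/"},{"name":"切片","slug":"切片","permalink":"http://yoursite.com/tags/%E5%88%87%E7%89%87/"},{"name":"R","slug":"R","permalink":"http://yoursite.com/tags/R/"},{"name":"Python","slug":"Python","permalink":"http://yoursite.com/tags/Python/"}],"categories":[{"name":"语言","slug":"语言","permalink":"http://yoursite.com/categories/%E8%AF%AD%E8%A8%80/"}]},{"title":"Checking for Anagrams","date":"2019-08-29T16:00:00.000Z","path":"wiki/判断变形词/","text":"Question: Given two strings str1 and str2, if they contain the same kinds of characters and each character occurs the same number of times, then str1 and str2 are anagrams of each other. Implement a function that decides whether two strings are anagrams. [Example] str1=\"123\", str2=\"231\": return true. str1=\"123\", str2=\"2331\": return false. Answer: @reynd Python3\ndef isAnagram(self, s: str, t: str) -> bool:\n    # Two empty strings are trivially anagrams.\n    if s == \"\" and t == \"\":\n        return True\n    # Different lengths or different character sets rule out an anagram.\n    if len(s) != len(t) or len(set(s)) != len(set(t)):\n        return False\n    # Compare the count of every distinct character.\n    for char in set(s):\n        s_count = s.count(char)\n        t_count = t.count(char)\n        if s_count != t_count:\n            return False\n    return True\nThis mainly relies on the built-in optimizations and properties of set to cut down on work, and uses boolean operations to improve efficiency. Runtime: 48 ms, faster than 98.93% of Python3 submissions. Memory usage: 14 MB, less than 29.70% of Python3 submissions. @bob C++ If str1 and str2 differ in length, return false immediately. If the lengths match, assume the character codes fall between 0 and 255 and allocate an integer array map of length 256, where map[a]=b means the character with code a has occurred b times; initially every entry map[0..255] is 0. Then traverse str1 and count each character; for example, on reaching the character 'a', whose code is 97, do map[97]++. map thus becomes a frequency table of the characters in str1. Then traverse str2 and, for each character, decrement its count in map; for example, on reaching 'a' (code 97), do map[97]--, and if the value drops below 0, return false immediately. If str2 is fully traversed and no entry in map has gone negative, return 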
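true.","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"算法","slug":"算法","permalink":"http://yoursite.com/tags/%E7%AE%97%E6%B3%95/"},{"name":"变形词","slug":"变形词","permalink":"http://yoursite.com/tags/%E5%8F%98%E5%BD%A2%E8%AF%8D/"}],"categories":[{"name":"算法","slug":"算法","permalink":"http://yoursite.com/categories/%E7%AE%97%E6%B3%95/"}]},{"title":"DAU and MAU","date":"2019-08-28T16:00:00.000Z","path":"wiki/日活与月活/","text":"Question: In product operations, what does a change in the ratio of daily to monthly active users tell us? Answer: Author: Aaron 余乐 Link: https://www.zhihu.com/question/24007425/answer/130382392 Source: Zhihu. Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, cite the source. DAU (daily active users): the number of active users in a single day, reflecting a product's short-term user activity. MAU (monthly active users): the number of active users in a single month, reflecting a product's long-term user activity. The DAU/MAU ratio multiplied by 30 equals the average number of days per month on which a user logs in. A high DAU/MAU ratio means that, among the users who used the product during the month, a large share use it every day: usage frequency is high, users depend on the product heavily, and stickiness is strong. It also indicates low churn and high retention. A low DAU/MAU ratio implies the opposite on every count: low usage frequency, weak dependence, weak stickiness, high churn, and low retention. The relationship between DAU and MAU is discussed below in 4 dynamic scenarios. The DAU/MAU ratio rises, with a marked increase in DAU. This suggests that a product change or some big news has reawakened part of the dormant users, but the change or news mostly reached the product's existing users. In this case we should step up promotion of the new features and guide more new users to become active users. The DAU/MAU ratio rises, with a marked decrease in MAU. This suggests that churn among non-loyal users has become serious. Users with a real need for the product can be retained, but those without one cannot. In this case we should explore diversifying features while protecting the core functionality, to meet the needs of more casual, non-loyal users. The DAU/MAU ratio falls, with a marked decrease in DAU. This suggests that something is wrong with our core functionality, or that outside events have made users panic about the product itself, such as some P2P platform absconding with funds, so that existing users move to competitors or stop using the product. In this case we should analyze what competitors are doing and make sure the experience of our core functionality is as good as it can be in every respect in order to retain users. The DAU/MAU ratio falls, with a marked increase in MAU. This suggests that a product change or outside promotion has boosted short-term user activity, but the change or effect is not sustainable; it may be a short-lived feature that users tire of after trying it once, such as 脸萌 or 足迹. In this case we need to think about how to increase stickiness and reduce churn.","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"应用","slug":"应用","permalink":"http://yoursite.com/tags/%E5%BA%94%E7%94%A8/"},{"name":"日活","slug":"日活","permalink":"http://yoursite.com/tags/%E6%97%A5%E6%B4%BB/"},{"name":"月活","slug":"月活","permalink":"http://yoursite.com/tags/%E6%9C%88%E6%B4%BB/"}],"categories":[{"name":"应用","slug":"应用","permalink":"http://yoursite.com/categories/%E5%BA%94%E7%94%A8/"}]},{"title":"Recommended Data Analysis Resources","date":"2019-08-20T16:00:00.000Z","path":"wiki/数据分析资料推荐/","text":"Question: For data analysis skills, which learning material do you recommend most? Please state the skill, the recommended stage, and your reason. For example: Answer: 《利用 Python 进行数据分析》 (Python for Data Analysis) @reynd Skill: Python 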
data analysis. Stage: beginner. Reason: the author is highly experienced and writes concisely and clearly; one read-through is enough to get started with the mainstream data analysis tools. Read online: https://wizardforcel.gitbooks.io/pyda-2e/content/","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"资料","slug":"资料","permalink":"http://yoursite.com/tags/%E8%B5%84%E6%96%99/"},{"name":"学习","slug":"学习","permalink":"http://yoursite.com/tags/%E5%AD%A6%E4%B9%A0/"}],"categories":[{"name":"理论","slug":"理论","permalink":"http://yoursite.com/categories/%E7%90%86%E8%AE%BA/"}]},{"title":"Estimating Crowd Size","date":"2019-08-19T16:00:00.000Z","path":"wiki/人数估算/","text":"Question: How can the number of participants in a march or rally in Hong Kong be estimated? Answer: Sampling. @mbpRetina Sampling, I'd say; both static and dynamic sampling work, followed by adjusting the weights.","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"应用","slug":"应用","permalink":"http://yoursite.com/tags/%E5%BA%94%E7%94%A8/"},{"name":"估算","slug":"估算","permalink":"http://yoursite.com/tags/%E4%BC%B0%E7%AE%97/"}],"categories":[{"name":"应用","slug":"应用","permalink":"http://yoursite.com/categories/%E5%BA%94%E7%94%A8/"}]},{"title":"The Josephus Problem","date":"2019-08-18T16:00:00.000Z","path":"wiki/约瑟夫环问题/","text":"Question: n people stand in a circle, numbered in order. Counting starts from the first person and runs from 1 to 3; whoever says 3 leaves the circle. What was the original number of the last person remaining? This seems unrelated to the group's name, but it looks like fun 🤪 Answer: C++ simulation @capwill2\n#include <iostream>\nusing namespace std;\n\n#define MAX 100000\nint num[MAX];  // num[i] == 1 means person i has left the circle\n\nint main() {\n    int n, p;  // n people; every p-th person leaves (p = 3 in the question)\n    cin >> n >> p;\n    int i = 1, current_step = 1, killed = 0;\n    while (killed != n - 1) {\n        if (current_step == p) {\n            num[i] = 1;\n            killed++;\n            cout << \"kill \" << i << endl;\n            current_step = 0;\n        }\n        i++;\n        if (i > n) i = 1;                 // wrap around the circle\n        if (num[i] == 0) current_step++;  // only people still in the circle count\n    }\n    for (int i = 1; i <= n; i++)\n        if (num[i] == 0) cout << \"survivor: \" << i;\n    return 0;\n}\n
The idea is to simulate a circular linked list with an array.","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"算法","slug":"算法","permalink":"http://yoursite.com/tags/%E7%AE%97%E6%B3%95/"},{"name":"约瑟夫环","slug":"约瑟夫环","permalink":"http://yoursite.com/tags/%E7%BA%A6%E7%91%9F%E5%A4%AB%E7%8E%AF/"}],"categories":[{"name":"算法","slug":"算法","permalink":"http://yoursite.com/categories/%E7%AE%97%E6%B3%95/"}]},{"title":"Unsupervised Learning","date":"2019-08-17T16:00:00.000Z","path":"wiki/无监督学习/","text":"Question: Suppose we have a data set where each data point represents a single student’s scores on a math test, a physics test, a reading comprehension test, and a vocabulary test. We find the first two principal components, which capture 90% of the variability in the data, and interpret their loadings. We conclude that the first principal component represents overall academic ability, and the second represents a contrast between quantitative ability and verbal ability. What loadings would be consistent with that interpretation? Choose all that apply. a. (0.5, 0.5, 0.5, 0.5) and (0.71, 0.71, 0, 0) b. (0.5, 0.5, 0.5, 0.5) and (0, 0, -0.71, -0.71) c. (0.5, 0.5, 0.5, 0.5) and (0.5, 0.5, -0.5, -0.5) d. (0.5, 0.5, 0.5, 0.5) and (-0.5, -0.5, 0.5, 0.5) f. (0.71, 0.71, 0, 0) and (0, 0, 0.71, 0.71) g. (0.71, 0, -0.71, 0) and (0, 0.71, 0, -0.71) Answer: @bob The answers are (c) and (d).","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"无监督学习","slug":"无监督学习","permalink":"http://yoursite.com/tags/%E6%97%A0%E7%9B%91%E7%9D%A3%E5%AD%A6%E4%B9%A0/"}],"categories":[{"name":"理论","slug":"理论","permalink":"http://yoursite.com/categories/%E7%90%86%E8%AE%BA/"}]},{"title":"Sorting Algorithms","date":"2019-08-16T16:00:00.000Z","path":"wiki/排序算法/","text":"Question: What are the common sorting algorithms, and how are they implemented in various languages? This is a rather broad question. There are many sorting algorithms, so a single example will do. Also, is it true that in R and Python you don't need to think about sorting algorithms? 
Answer: Three families of sorts @capwill2 Insertion sorts: straight insertion sort and Shell sort; exchange sorts: bubble sort and quicksort; selection sorts: simple selection sort and heapsort. As an example, the simplest heapsort in C++ just uses the STL priority queue.","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"算法","slug":"算法","permalink":"http://yoursite.com/tags/%E7%AE%97%E6%B3%95/"},{"name":"排序","slug":"排序","permalink":"http://yoursite.com/tags/%E6%8E%92%E5%BA%8F/"}],"categories":[{"name":"算法","slug":"算法","permalink":"http://yoursite.com/categories/%E7%AE%97%E6%B3%95/"}]},{"title":"SQL Wildcards","date":"2019-08-15T16:00:00.000Z","path":"wiki/SQL 通配符/","text":"Question: SQL allows wildcards in string matching. What can '%' stand for? A. Zero characters B. One character C. Multiple characters D. All of the above Answer: @bob It matches zero or more characters, so the choice is D.","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"数据库","slug":"数据库","permalink":"http://yoursite.com/tags/%E6%95%B0%E6%8D%AE%E5%BA%93/"},{"name":"SQL","slug":"SQL","permalink":"http://yoursite.com/tags/SQL/"}],"categories":[{"name":"应用","slug":"应用","permalink":"http://yoursite.com/categories/%E5%BA%94%E7%94%A8/"}]},{"title":"The t-Test","date":"2019-08-13T16:00:00.000Z","path":"wiki/t 检验/","text":"Question: Does the t-test require the sample data to be normally distributed? Answer: It can be used as long as the sampling distribution is normal. @mbpRetina","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"统计","slug":"统计","permalink":"http://yoursite.com/tags/%E7%BB%9F%E8%AE%A1/"},{"name":"t检验","slug":"t检验","permalink":"http://yoursite.com/tags/t%E6%A3%80%E9%AA%8C/"},{"name":"正态","slug":"正态","permalink":"http://yoursite.com/tags/%E6%AD%A3%E6%80%81/"}],"categories":[{"name":"理论","slug":"理论","permalink":"http://yoursite.com/categories/%E7%90%86%E8%AE%BA/"}]},{"title":"Supervised Learning","date":"2019-08-11T16:00:00.000Z","path":"wiki/监督学习/","text":"Question: What is supervised learning? 
Answer: A model trained on sample data whose classes/labels are already known. Examples: kNN, Bayesian classification, regression. @mbpRetina","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"监督学习","slug":"监督学习","permalink":"http://yoursite.com/tags/%E7%9B%91%E7%9D%A3%E5%AD%A6%E4%B9%A0/"}],"categories":[{"name":"理论","slug":"理论","permalink":"http://yoursite.com/categories/%E7%90%86%E8%AE%BA/"}]},{"title":"Sorting and Deduplicating Lists","date":"2019-08-09T16:00:00.000Z","path":"wiki/列表排序、去重/","text":"Question: Using your own algorithm, merge the following two lists in ascending order and remove duplicate elements: list1 = [2, 3, 8, 4, 9, 5, 6] list2 = [5, 6, 10, 17, 11, 2] Answer: @bob Wrote it in C++\n#include <algorithm>\n#include <iostream>\nusing namespace std;\n\nint main() {\n    int list1[] = {2, 3, 8, 4, 9, 5, 6};\n    int list2[] = {5, 6, 10, 17, 11, 2};\n    const int length1 = sizeof(list1) / sizeof(list1[0]);\n    const int length2 = sizeof(list2) / sizeof(list2[0]);\n    sort(list1, list1 + length1);\n    sort(list2, list2 + length2);\n    int list[length1 + length2], index = 0, i = 0, j = 0;\n    // Merge while both arrays still have elements; equal values are written once.\n    while (i < length1 && j < length2) {\n        if (list1[i] == list2[j]) {\n            list[index++] = list1[i++];\n            j++;\n        } else if (list1[i] > list2[j]) {\n            list[index++] = list2[j++];\n        } else {\n            list[index++] = list1[i++];\n        }\n    }\n    // Copy whatever remains in either array.\n    while (i < length1) {\n        list[index++] = list1[i++];\n    }\n    while (j < length2) {\n        list[index++] = list2[j++];\n    }\n    for (int k = 0; k < index; k++)\n        cout << list[k] << \" \";\n    return 0;\n}\n@capwill2 I also just wrote a 40-line version in C++. My approach is to sort everything first and then do a single merge. The complexity is that of the sort; I used the STL sort, O(n log n). Actually, one could simply use a set 
container, which deduplicates and sorts in one go.","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"算法","slug":"算法","permalink":"http://yoursite.com/tags/%E7%AE%97%E6%B3%95/"},{"name":"列表","slug":"列表","permalink":"http://yoursite.com/tags/%E5%88%97%E8%A1%A8/"},{"name":"排序","slug":"排序","permalink":"http://yoursite.com/tags/%E6%8E%92%E5%BA%8F/"},{"name":"去重","slug":"去重","permalink":"http://yoursite.com/tags/%E5%8E%BB%E9%87%8D/"}],"categories":[{"name":"算法","slug":"算法","permalink":"http://yoursite.com/categories/%E7%AE%97%E6%B3%95/"}]},{"title":"Linear Mixed-Effects Models","date":"2019-08-08T16:00:00.000Z","path":"wiki/线性混合效应模型/","text":"Question: In a linear mixed-effects model with the formula Y ~ 1 + A + B + (1 | C:D:E), what do the number 1 and the symbol | each mean? How should such a formula be read? Answer: None yet. Head to the Telegram group to contribute an answer. Without small steps, no one reaches a thousand miles.","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"回归","slug":"回归","permalink":"http://yoursite.com/tags/%E5%9B%9E%E5%BD%92/"},{"name":"线性模型","slug":"线性模型","permalink":"http://yoursite.com/tags/%E7%BA%BF%E6%80%A7%E6%A8%A1%E5%9E%8B/"},{"name":"混合效应","slug":"混合效应","permalink":"http://yoursite.com/tags/%E6%B7%B7%E5%90%88%E6%95%88%E5%BA%94/"}],"categories":[{"name":"理论","slug":"理论","permalink":"http://yoursite.com/categories/%E7%90%86%E8%AE%BA/"}]},{"title":"Cluster Analysis","date":"2019-08-07T16:00:00.000Z","path":"wiki/聚类分析/","text":"Question: What kinds of cluster analysis are there, and when is each one used? 
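Answer: Clustering is generally divided into partitioning methods and hierarchical methods. A partitioning method first specifies a center for each cluster, then computes the distance between each observation and the cluster centers and adjusts the centers accordingly; the best-known partitioning method is K-means. Partitioning methods suit data with many observations, but the number of clusters K has to be supplied in advance. A hierarchical method either starts with every observation as its own cluster and repeatedly merges the two most similar clusters into a new one until some stopping condition is met, or starts with all observations in a single cluster and repeatedly splits one cluster into two until some stopping condition is met. There are many hierarchical methods; they generally suit data with fewer observations and do not require the number of clusters as input. @yzxsiw","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"分类","slug":"分类","permalink":"http://yoursite.com/tags/%E5%88%86%E7%B1%BB/"},{"name":"聚类","slug":"聚类","permalink":"http://yoursite.com/tags/%E8%81%9A%E7%B1%BB/"}],"categories":[{"name":"理论","slug":"理论","permalink":"http://yoursite.com/categories/%E7%90%86%E8%AE%BA/"}]},{"title":"The Relationship Between Reliability and Validity","date":"2019-08-06T16:00:00.000Z","path":"wiki/信度与效度的关系/","text":"Question: What is the relationship between reliability and validity? Answer: 1) High reliability is a necessary but not sufficient condition for high validity. @bob 2) In the social sciences, reliability is the likelihood of obtaining the same result when the same object is measured repeatedly with the same research technique, whereas validity is the extent to which an empirical measurement reflects the true meaning of a concept. Researchers strive to perfect both at once, but in practice the two often pull against each other and a trade-off has to be made. @reynd","tags":[{"name":"每日一题","slug":"每日一题","permalink":"http://yoursite.com/tags/%E6%AF%8F%E6%97%A5%E4%B8%80%E9%A2%98/"},{"name":"测量","slug":"测量","permalink":"http://yoursite.com/tags/%E6%B5%8B%E9%87%8F/"},{"name":"信度","slug":"信度","permalink":"http://yoursite.com/tags/%E4%BF%A1%E5%BA%A6/"},{"name":"效度","slug":"效度","permalink":"http://yoursite.com/tags/%E6%95%88%E5%BA%A6/"}],"categories":[{"name":"理论","slug":"理论","permalink":"http://yoursite.com/categories/%E7%90%86%E8%AE%BA/"}]}]}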