Tag Archives: Deep Learning

Deep Learning Specialization on Coursera

A Roundup of Recommended Data Science Courses (Open Courses) on Coursera

Coursera offers many data science courses; this post rounds up a batch of them.

1. Introduction to Data Science Specialization

IBM's Introduction to Data Science Specialization consists of 4 courses covering an introduction to data science, open source tools for data science, data science methodology, and SQL basics. If you are interested, take a look: Launch your career in Data Science. Data Science skills to prepare for a career or further advanced learning in Data Science.

1) What is Data Science?
2) Open Source tools for Data Science
3) Data Science Methodology
4) Databases and SQL for Data Science

2. Applied Data Science Specialization

IBM's Applied Data Science Specialization consists of 4 courses covering Python for data science, data visualization with Python, data analysis with Python, and an applied data science capstone project. If you are interested, take a look: Get hands-on skills for a Career in Data Science. Learn Python, analyze and visualize data. Apply your skills to data science and machine learning.

1) Python for Data Science
2) Data Visualization with Python
3) Data Analysis with Python
4) Applied Data Science Capstone

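3. Applied Data Science with Python Specialization

The University of Michigan's Applied Data Science with Python Specialization introduces the main areas of data science through the Python programming language, including applied statistics, machine learning, information visualization, text analysis, and social network analysis, and teaches them with popular Python toolkits such as pandas, matplotlib, scikit-learn, nltk, and networkx. If you are interested, take a look: Gain new insights into your data. Learn to apply data science methods and techniques, and acquire analysis skills.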

1) Introduction to Data Science in Python
2) Applied Plotting, Charting & Data Representation in Python
3) Applied Machine Learning in Python
4) Applied Text Mining in Python
5) Applied Social Network Analysis in Python

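4. Data Science Specialization

Johns Hopkins University's Data Science Specialization has 10 courses (nine courses plus a capstone): the data scientist's toolbox, R programming, getting and cleaning data, exploratory data analysis, reproducible research, statistical inference, regression models, practical machine learning, developing data products, and a data science capstone project. If you are interested, take a look: Launch Your Career in Data Science. A nine-course introduction to data science, developed and taught by leading professors.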

1) The Data Scientist’s Toolbox
2) R Programming
3) Getting and Cleaning Data
4) Exploratory Data Analysis
5) Reproducible Research
6) Statistical Inference
7) Regression Models
8) Practical Machine Learning
9) Developing Data Products
10) Data Science Capstone

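5. Data Science at Scale Specialization

The University of Washington's Data Science at Scale Specialization consists of 3 courses and 1 capstone project, covering large-scale data systems and algorithms, models and methods for data analytics, and communicating data science results. If you are interested, take a look: Tackle Real Data Challenges. Master computational, statistical, and informational data science in three courses.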

1) Data Manipulation at Scale: Systems and Algorithms
2) Practical Predictive Analytics: Models and Methods
3) Communicating Data Science Results
4) Data Science at Scale – Capstone Project

6. Advanced Data Science with IBM Specialization

IBM's Advanced Data Science with IBM Specialization consists of 4 courses covering the fundamentals of data science, advanced machine learning and signal processing, and applied AI with deep learning. If you are interested, take a look: Expert in DataScience, Machine Learning and AI. Become an IBM-approved Expert in Data Science, Machine Learning and Artificial Intelligence.

1) Fundamentals of Scalable Data Science
2) Advanced Machine Learning and Signal Processing
3) Applied AI with DeepLearning
4) Advanced Data Science Capstone

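7. Data Mining Specialization

The University of Illinois at Urbana-Champaign's Data Mining Specialization consists of 5 courses and 1 capstone project, covering data visualization, information retrieval, text mining and analytics, pattern discovery, and cluster analysis. If you are interested, take a look: Data Mining Specialization. Analyze Text, Discover Patterns, Visualize Data. Solve real-world data mining challenges.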

1) Data Visualization
2) Text Retrieval and Search Engines
3) Text Mining and Analytics
4) Pattern Discovery in Data Mining
5) Cluster Analysis in Data Mining
6) Data Mining Project

8. Data Analysis and Interpretation Specialization

The Data Analysis and Interpretation Specialization consists of 5 courses: data management and visualization, data analysis tools, regression modeling, machine learning, and a capstone project. If you are interested, take a look: Learn Data Science Fundamentals. Drive real world impact with a four-course introduction to data science.

1) Data Management and Visualization
2) Data Analysis Tools
3) Regression Modeling in Practice
4) Machine Learning for Data Analysis
5) Data Analysis and Interpretation Capstone

9. Executive Data Science Specialization

The Executive Data Science Specialization consists of 4 courses and 1 capstone project, covering a crash course in data science, building a data science team, managing data analysis, and data science in real life. If you are interested, take a look: Be The Leader Your Data Team Needs. Learn to lead a data science team that generates first-rate analyses in four courses.

1) A Crash Course in Data Science
2) Building a Data Science Team
3) Managing Data Analysis
4) Data Science in Real Life
5) Executive Data Science Capstone

10. Other related data science courses

1) Data Science Math Skills
2) Data Science Ethics
3) How to Win a Data Science Competition: Learn from Top Kagglers

Note: This is an original article. When reposting, please credit the source, “课程图谱博客” (CourseGraph Blog): http://blog.coursegraph.com

Permalink: http://blog.coursegraph.com/coursera上数据科学相关课程数据科学公开课汇总推荐 http://blog.coursegraph.com/?p=851

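A Roundup of Recommended Machine Learning Courses (Open Courses) on Coursera

Coursera has many machine learning courses, and this post summarizes them. Because machine learning spans so many concepts and applications, the recommendations here are limited to courses directly about machine learning. Deep learning is part of machine learning, but it is left out for now; a separate series of deep learning course recommendations will follow.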

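1. Andrew Ng's Machine Learning course

The first choice for getting started with machine learning, bar none. This course has drawn attention ever since it launched, and reportedly millions of people around the world have entered machine learning through it. It is an introductory-level course with modest background requirements, and Andrew Ng explains things in a very accessible way, so starting your machine learning journey here is strongly recommended. Course description:

Machine learning is the science of getting computers to act without being explicitly programmed. Over the past two decades, machine learning has given us self-driving cars, practical speech recognition, effective web search, and a vastly improved understanding of the human genome. Machine learning is so pervasive today that you probably use it dozens of times a day without noticing. Many researchers also think it is the most effective way to make progress in artificial intelligence (AI). In this course, you will learn the most effective machine learning techniques, understand how to use them, and practice implementing them yourself. More importantly, you will learn not only the theory but also the practical know-how needed to apply these powerful techniques quickly to new problems. Finally, you will learn how Silicon Valley companies innovate in machine learning and AI. This course provides a broad introduction to machine learning, data mining, and statistical pattern recognition. Topics include: (i) Supervised learning (parametric and non-parametric algorithms, support vector machines, kernels, and neural networks). (ii) Unsupervised learning (clustering, dimensionality reduction, recommender systems, and deep learning). (iii) Machine learning practice (bias/variance theory; innovation in machine learning and AI). The course draws on many case studies and applications, so you will also learn how to apply learning algorithms in different areas such as smart robots (perception and control), text understanding (web search and spam filtering), computer vision, medical informatics, audio, and database mining.

The reviews of the older version of this course are well worth consulting: Machine Learning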

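2. Machine Learning Foundations, Part 1 (機器學習基石上): Mathematical Foundations, by Hsuan-Tien Lin (林轩田) of National Taiwan University

If you already have some background or have finished Andrew Ng's Machine Learning course, Machine Learning Foundations, Part 1 (Mathematical Foundations) makes a good next step. Professor Lin's two earlier machine learning courses, Machine Learning Foundations and Machine Learning Techniques, were both well regarded and demanding. They have now been reorganized into Part 1 and Part 2, which are well worth looking forward to: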

Machine learning is the study that allows computers to adaptively improve their performance with experience accumulated from the data observed. Our two sister courses teach the most fundamental algorithmic, theoretical and practical tools that any user of machine learning needs to know. This first course of the two would focus more on mathematical tools, and the other course would focus more on algorithmic tools. [機器學習旨在讓電腦能由資料中累積的經驗來自我進步。我們的兩項姊妹課程將介紹各領域中的機器學習使用者都應該知道的基礎演算法、理論及實務工具。本課程將較為著重數學類的工具,而另一課程將較為著重方法類的工具。]

3. Machine Learning Foundations, Part 2 (機器學習基石下): Algorithmic Foundations, by Hsuan-Tien Lin (林轩田) of National Taiwan University

The companion to course 2, Machine Learning Foundations, Part 2 (Algorithmic Foundations) focuses more on machine learning algorithms:

Machine learning is the study that allows computers to adaptively improve their performance with experience accumulated from the data observed. Our two sister courses teach the most fundamental algorithmic, theoretical and practical tools that any user of machine learning needs to know. This second course of the two would focus more on algorithmic tools, and the other course would focus more on mathematical tools. [機器學習旨在讓電腦能由資料中累積的經驗來自我進步。我們的兩項姊妹課程將介紹各領域中的機器學習使用者都應該知道的基礎演算法、理論及實務工具。本課程將較為著重方法類的工具,而另一課程將較為著重數學類的工具。]

See also the reviews of the earlier versions of these courses: 機器學習基石 (Machine Learning Foundations) and 機器學習技法 (Machine Learning Techniques)

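4. University of Washington's Machine Learning Specialization

This specialization contains 4 courses: Machine Learning Foundations: A Case Study Approach; Machine Learning: Regression; Machine Learning: Classification; and Machine Learning: Clustering & Retrieval: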

This Specialization from leading researchers at the University of Washington introduces you to the exciting, high-demand field of Machine Learning. Through a series of practical case studies, you will gain applied experience in major areas of Machine Learning including Prediction, Classification, Clustering, and Information Retrieval. You will learn to analyze large and complex datasets, create systems that adapt and improve over time, and build intelligent applications that can make predictions from data.

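4.1 Machine Learning Foundations: A Case Study Approach

This course gives you hands-on experience through a series of practical case studies. It is aimed at readers who wonder what their data can tell them, who want a deeper understanding of the core ways machine learning can improve a business, and who want to be able to talk with specialists about everything from regression and classification to deep learning and recommender systems.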

Do you have data and wonder what it can tell you? Do you need a deeper understanding of the core ways in which machine learning can improve your business? Do you want to be able to converse with specialists about anything from regression and classification to deep learning and recommender systems? In this course, you will get hands-on experience with machine learning from a series of practical case-studies. At the end of the first course you will have studied how to predict house prices based on house-level features, analyze sentiment from user reviews, retrieve documents of interest, recommend products, and search for images. Through hands-on practice with these use cases, you will be able to apply machine learning methods in a wide range of domains. This first course treats the machine learning method as a black box. Using this abstraction, you will focus on understanding tasks of interest, matching these tasks to machine learning tools, and assessing the quality of the output. In subsequent courses, you will delve into the components of this black box by examining models and algorithms. Together, these pieces form the machine learning pipeline, which you will use in developing intelligent applications. Learning Outcomes: By the end of this course, you will be able to: -Identify potential applications of machine learning in practice. -Describe the core differences in analyses enabled by regression, classification, and clustering. -Select the appropriate machine learning task for a potential application. -Apply regression, classification, clustering, retrieval, recommender systems, and deep learning. -Represent your data as features to serve as input to machine learning models. -Assess the model quality in terms of relevant error metrics for each task. -Utilize a dataset to fit a model to analyze new data. -Build an end-to-end application that uses machine learning at its core. -Implement these techniques in Python.

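4.2 Machine Learning: Regression

This course focuses on a fundamental machine learning problem, regression. It also teaches through a case study (predicting house prices), and the relevant algorithms are ultimately implemented in Python.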

Case Study – Predicting Housing Prices In our first case study, predicting house prices, you will create models that predict a continuous value (price) from input features (square footage, number of bedrooms and bathrooms,…). This is just one of the many places where regression can be applied. Other applications range from predicting health outcomes in medicine, stock prices in finance, and power usage in high-performance computing, to analyzing which regulators are important for gene expression. In this course, you will explore regularized linear regression models for the task of prediction and feature selection. You will be able to handle very large sets of features and select between models of various complexity. You will also analyze the impact of aspects of your data — such as outliers — on your selected models and predictions. To fit these models, you will implement optimization algorithms that scale to large datasets. Learning Outcomes: By the end of this course, you will be able to: -Describe the input and output of a regression model. -Compare and contrast bias and variance when modeling data. -Estimate model parameters using optimization algorithms. -Tune parameters with cross validation. -Analyze the performance of the model. -Describe the notion of sparsity and how LASSO leads to sparse solutions. -Deploy methods to select between models. -Exploit the model to form predictions. -Build a regression model to predict prices using a housing dataset. -Implement these techniques in Python.
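The description above mentions that LASSO leads to sparse solutions. One way to see why is the soft-thresholding operator, which appears in common LASSO solvers: it shrinks every coefficient toward zero and sets the small ones to exactly zero. The sketch below is only an illustration under assumptions, not material from the course. The function name, the sample coefficients, and the threshold value are all hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values within the threshold become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


# Hypothetical unregularized coefficients and an L1 penalty threshold (assumed values).
coefficients = [2.5, -0.25, 0.125, -1.75]
threshold = 0.5

sparse = [soft_threshold(c, threshold) for c in coefficients]
print(sparse)
```

Coefficients whose magnitude is below the threshold end up exactly at zero, which is the sense in which the fitted model becomes sparse and performs feature selection.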

4.3 Machine Learning: Classification

This course focuses on another fundamental machine learning problem, classification. It teaches through two case studies, sentiment analysis and loan default prediction, and the relevant algorithms are ultimately implemented in Python (other languages are allowed, but Python is highly recommended).

Case Studies: Analyzing Sentiment & Loan Default Prediction In our case study on analyzing sentiment, you will create models that predict a class (positive/negative sentiment) from input features (text of the reviews, user profile information,…). In our second case study for this course, loan default prediction, you will tackle financial data, and predict when a loan is likely to be risky or safe for the bank. These tasks are examples of classification, one of the most widely used areas of machine learning, with a broad array of applications, including ad targeting, spam detection, medical diagnosis and image classification. In this course, you will create classifiers that provide state-of-the-art performance on a variety of tasks. You will become familiar with the most successful techniques, which are most widely used in practice, including logistic regression, decision trees and boosting. In addition, you will be able to design and implement the underlying algorithms that can learn these models at scale, using stochastic gradient ascent. You will implement these techniques on real-world, large-scale machine learning tasks. You will also address significant tasks you will face in real-world applications of ML, including handling missing data and measuring precision and recall to evaluate a classifier. This course is hands-on, action-packed, and full of visualizations and illustrations of how these techniques will behave on real data. We’ve also included optional content in every module, covering advanced topics for those who want to go even deeper! Learning Objectives: By the end of this course, you will be able to: -Describe the input and output of a classification model. -Tackle both binary and multiclass classification problems. -Implement a logistic regression model for large-scale classification. -Create a non-linear model using decision trees. -Improve the performance of any model using boosting. -Scale your methods with stochastic gradient ascent. -Describe the underlying decision boundaries. -Build a classification model to predict sentiment in a product review dataset. -Analyze financial data to predict loan defaults. -Use techniques for handling missing data. -Evaluate your models using precision-recall metrics. -Implement these techniques in Python (or in the language of your choice, though Python is highly recommended).
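The description above lists precision and recall as the way to evaluate a classifier. As a rough illustration (not taken from the course), the two metrics can be computed directly from true and predicted labels. The function name and the loan labels below are hypothetical and chosen only to show the arithmetic.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels: 1 = risky loan, 0 = safe loan.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision answers "of the loans flagged as risky, how many really were", while recall answers "of the truly risky loans, how many were flagged". Which one matters more depends on the cost of each kind of mistake.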

4.4 Machine Learning: Clustering & Retrieval

This course focuses on two more fundamental machine learning problems, clustering and retrieval. It again teaches through a case study, finding similar documents, a problem of real practical value:

Case Studies: Finding Similar Documents A reader is interested in a specific news article and you want to find similar articles to recommend. What is the right notion of similarity? Moreover, what if there are millions of other documents? Each time you want to retrieve a new document, do you need to search through all other documents? How do you group similar documents together? How do you discover new, emerging topics that the documents cover? In this third case study, finding similar documents, you will examine similarity-based algorithms for retrieval. In this course, you will also examine structured representations for describing the documents in the corpus, including clustering and mixed membership models, such as latent Dirichlet allocation (LDA). You will implement expectation maximization (EM) to learn the document clusterings, and see how to scale the methods using MapReduce. Learning Outcomes: By the end of this course, you will be able to: -Create a document retrieval system using k-nearest neighbors. -Identify various similarity metrics for text data. -Reduce computations in k-nearest neighbor search by using KD-trees. -Produce approximate nearest neighbors using locality sensitive hashing. -Compare and contrast supervised and unsupervised learning tasks. -Cluster documents by topic using k-means. -Describe how to parallelize k-means using MapReduce. -Examine probabilistic clustering approaches using mixture models. -Fit a mixture of Gaussians model using expectation maximization (EM). -Perform mixed membership modeling using latent Dirichlet allocation (LDA). -Describe the steps of a Gibbs sampler and how to use its output to draw inferences. -Compare and contrast initialization techniques for non-convex optimization objectives. -Implement these techniques in Python.
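To make the idea of document retrieval with k-nearest neighbors a little more concrete, here is a minimal brute-force sketch. It assumes a plain bag-of-words representation and cosine similarity as the similarity metric. The helper names and the tiny corpus are hypothetical, and this is not the course's own implementation.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a document as lowercase word counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_documents(query, corpus, k=1):
    """Return the k documents most similar to the query (brute-force scan)."""
    q = bag_of_words(query)
    scored = [(cosine_similarity(q, bag_of_words(doc)), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


# Hypothetical three-document corpus.
corpus = [
    "the team won the football match",
    "stock prices fell on the market",
    "the football season starts soon",
]
print(nearest_documents("football match results", corpus, k=2))
```

This version compares the query against every document, which is exactly the cost the description asks about for millions of documents. KD-trees and locality sensitive hashing, both listed in the learning outcomes, are ways to avoid that full scan.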

5. University of Michigan's Applied Machine Learning in Python

This applied course focuses on applying machine learning with Python. It covers the difference between machine learning and statistics, an introduction to the scikit-learn toolkit, supervised and unsupervised learning, and generalization issues (such as cross-validation and overfitting). The course is also part of the Applied Data Science with Python Specialization.

This course will introduce the learner to applied machine learning, focusing more on the techniques and methods than on the statistics behind these methods. The course will start with a discussion of how machine learning is different than descriptive statistics, and introduce the scikit learn toolkit. The issue of dimensionality of data will be discussed, and the task of clustering data, as well as evaluating those clusters, will be tackled. Supervised approaches for creating predictive models will be described, and learners will be able to apply the scikit learn predictive modelling methods while understanding process issues related to data generalizability (e.g. cross validation, overfitting). The course will end with a look at more advanced techniques, such as building ensembles, and practical limitations of predictive models. By the end of this course, students will be able to identify the difference between a supervised (classification) and unsupervised (clustering) technique, identify which technique they need to apply for a particular dataset and need, engineer features to meet that need, and write python code to carry out an analysis. This course should be taken after Introduction to Data Science in Python and Applied Plotting, Charting & Data Representation in Python and before Applied Text Mining in Python and Applied Social Network Analysis in Python.
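Cross-validation, mentioned above as a guard against overfitting, splits the data into k folds and holds out each fold once for validation. The sketch below shows only the index bookkeeping in plain Python, with a hypothetical function name and no shuffling. In practice the course's toolkit, scikit-learn, provides this through `sklearn.model_selection.KFold`.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # The first (n_samples % k) folds get one extra sample so every sample is used once.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


for train, validation in k_fold_indices(6, 3):
    print(train, validation)
```

A model would be fit on each `train` split and scored on the matching `validation` split. Averaging the k scores gives an estimate of how well the model generalizes to data it has not seen.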

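6. Advanced Machine Learning Specialization, offered jointly by Russia's National Research University Higher School of Economics and Yandex

The specialization is taught in English and consists of 7 courses covering deep learning, Kaggle data science competitions, Bayesian methods for machine learning, reinforcement learning, computer vision, natural language processing, and more. As of this writing the first 3 courses have launched. If you are interested, take a look: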

This specialization gives an introduction to deep learning, reinforcement learning, natural language understanding, computer vision and Bayesian methods. Top Kaggle machine learning practitioners and CERN scientists will share their experience of solving real-world problems and help you to fill the gaps between theory and practice. Upon completion of 7 courses you will be able to apply modern machine learning methods in enterprise and understand the caveats of real-world data and settings.

The sub-course directly related to machine learning is listed below; the others are omitted here:

6.3 Bayesian Methods for Machine Learning

This course covers Bayesian methods in machine learning. Bayesian methods are useful in many fields, for example game development and drug discovery. They give many machine learning algorithms “superpowers”, such as handling missing data and extracting much more information from small datasets. Applied to deep learning, Bayesian methods can compress models 100-fold and tune hyperparameters automatically, saving time and money.

Bayesian methods are used in lots of fields: from game development to drug discovery. They give superpowers to many machine learning algorithms: handling missing data, extracting much more information from small datasets. Bayesian methods also allow us to estimate uncertainty in predictions, which is a really desirable feature for fields like medicine. When Bayesian methods are applied to deep learning, it turns out that they allow you to compress your models 100-fold, and automatically tune hyperparameters, saving your time and money. In six weeks we will discuss the basics of Bayesian methods: from how to define a probabilistic model to how to make predictions from it. We will see how one can fully automate this workflow and how to speed it up using some advanced techniques. We will also see applications of Bayesian methods to deep learning and how to generate new images with it. We will see how new drugs that cure severe diseases can be found with Bayesian methods.

7. Johns Hopkins University's Practical Machine Learning

This course applies machine learning in practice from a data science perspective. It introduces basic machine learning concepts such as training sets, test sets, overfitting, and error rates, as well as basic models and algorithms such as regression, classification, Naive Bayes, and random forests. The course ultimately covers a complete practical machine learning cycle, including data collection, feature creation, applying machine learning algorithms, and evaluating results. It is also part of Johns Hopkins University's Data Science Specialization:

One of the most common tasks performed by data scientists and data analysts is prediction and machine learning. This course will cover the basic components of building and applying prediction functions with an emphasis on practical applications. The course will provide basic grounding in concepts such as training and test sets, overfitting, and error rates. The course will also introduce a range of model based and algorithmic machine learning methods including regression, classification trees, Naive Bayes, and random forests. The course will cover the complete process of building prediction functions including data collection, feature creation, algorithms, and evaluation.

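8. Wesleyan University's Regression Modeling in Practice

This course focuses on one of the most important concepts and tools in data analysis and machine learning: regression analysis. Using either SAS or Python, it starts with linear regression, moves on to regression modeling as a whole, and applies regression models to data analysis: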

This course focuses on one of the most important tools in your data analysis arsenal: regression analysis. Using either SAS or Python, you will begin with linear regression and then learn how to adapt when two variables do not present a clear linear relationship. You will examine multiple predictors of your outcome and be able to identify confounding variables, which can tell a more compelling story about your results. You will learn the assumptions underlying regression analysis, how to interpret regression coefficients, and how to use regression diagnostic plots and other tools to evaluate the quality of your regression model. Throughout the course, you will share with others the regression models you have developed and the stories they tell you.

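This course is also part of Wesleyan University's Data Analysis and Interpretation Specialization.

9. Wesleyan University's Machine Learning for Data Analysis

This course focuses on machine learning for data analysis. Machine learning is the process of developing, testing, and applying predictive algorithms to achieve a goal. Building on Regression Modeling in Practice, which introduces supervised machine learning concepts, the course covers everything from basic classification algorithms to decision trees and clustering. By completing it, you will learn how to apply, test, and interpret machine learning algorithms to solve real problems.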

Are you interested in predicting future outcomes using your data? This course helps you do just that! Machine learning is the process of developing, testing, and applying predictive algorithms to achieve this goal. Make sure to familiarize yourself with course 3 of this specialization before diving into these machine learning concepts. Building on Course 3, which introduces students to integral supervised machine learning concepts, this course will provide an overview of many additional concepts, techniques, and algorithms in machine learning, from basic classification to decision trees and clustering. By completing this course, you will learn how to apply, test, and interpret machine learning algorithms as alternative methods for addressing your research questions.

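This course is also part of Wesleyan University's Data Analysis and Interpretation Specialization.

10. University of California, San Diego's Machine Learning With Big Data

This course focuses on machine learning techniques for big data and introduces the relevant algorithms and tools. In this course you will learn to: design an approach to leverage data using the steps of the machine learning process; apply machine learning techniques to explore and prepare data for modeling; identify the type of machine learning problem; build models from data using widely available open source tools; and analyze big data with scalable machine learning algorithms on Spark.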

Want to make sense of the volumes of data you have collected? Need to incorporate data-driven decisions into your process? This course provides an overview of machine learning techniques to explore, analyze, and leverage data. You will be introduced to tools and algorithms you can use to create machine learning models that learn from data, and to scale those models up to big data problems. At the end of the course, you will be able to: • Design an approach to leverage data using the steps in the machine learning process. • Apply machine learning techniques to explore and prepare data for modeling. • Identify the type of machine learning problem in order to apply the appropriate set of techniques. • Construct models that learn from data using widely available open source tools. • Analyze big data problems using scalable machine learning algorithms on Spark.

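This course is also part of the University of California, San Diego's Big Data Specialization.

11. Big Data Applications: Machine Learning at Scale, from the Russian search giant Yandex

Machine learning is changing the world. In this course you will learn to: identify practical problems that can be solved with machine learning; build, tune, and apply linear models with Spark MLLib; understand methods of text processing; solve machine learning problems with decision trees and boosting; and build your own recommender system.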

Machine learning is transforming the world around us. To become successful, you’d better know what kinds of problems can be solved with machine learning, and how they can be solved. Don’t know where to start? The answer is one button away. During this course you will: – Identify practical problems which can be solved with machine learning – Build, tune and apply linear models with Spark MLLib – Understand methods of text processing – Fit decision trees and boost them with ensemble learning – Construct your own recommender system. As a practical assignment, you will – build and apply linear models for classification and regression tasks; – learn how to work with texts; – automatically construct decision trees and improve their performance with ensemble learning; – finally, you will build your own recommender system! With these skills, you will be able to tackle many practical machine learning tasks. We provide the tools, you choose the place of application to make this world of machines more intelligent.

This course is also part of Yandex's Big Data for Data Engineers Specialization.

Note: This is an original article. When reposting, please credit the source, “课程图谱博客” (CourseGraph Blog): http://blog.coursegraph.com

Permalink: http://blog.coursegraph.com/coursera上机器学习课程公开课汇总推荐 http://blog.coursegraph.com/?p=696

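A Roundup of Deep Learning Course Resources

This post collects resources for deep learning courses and related open courses. It is updated continuously and is for reference only.

1. Andrew Ng's Deep Learning Specialization by Coursera and deeplearning.ai

This is a course from deeplearning.ai, the first deep learning project Andrew Ng launched after leaving Baidu: the Deep Learning Specialization, whose slogan is: Master Deep Learning, and Break into AI. As a co-founder of Coursera and the instructor of the hugely popular Machine Learning course, Andrew Ng has led millions of students into machine learning, and the slogan of this deep learning course reveals his ambition: to keep leading millions into deep learning.

As a fan of Andrew Ng, I still recommend this as the first choice for an introductory deep learning course, and I suggest paying for the Coursera version. You get the quizzes and a certificate, and most importantly the programming assignments, which are the key to understanding the material; watching the videos alone will not get you there. See also: “Notes on Andrew Ng's Deep Learning Courses” and “A Summary of Andrew Ng's Deep Learning Courses”.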

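2. Geoffrey Hinton's Neural Networks for Machine Learning

Geoffrey Hinton's deep learning course ran once on Coursera in 2012 and then went quiet until Coursera's new course platform launched; since then it has run many times. Reviews from CourseGraph (课程图谱) users:

“A must-take course for deep learning”

“Taught directly by the founder and pioneer of the field; it blows away everything second-rate”

This course is somewhat harder than Andrew Ng's deep learning course above, but it has no programming assignments, only quizzes.

3. University of Oxford deep learning course (2015): Deep learning at Oxford 2015

Although this course is titled “Machine Learning 2014-2015”, it focuses mainly on deep learning and works well as a systematic course on machine learning and deep learning: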

Machine learning techniques enable us to automatically extract features from data so as to solve predictive tasks, such as speech recognition, object recognition, machine translation, question-answering, anomaly detection, medical diagnosis and prognosis, automatic algorithm configuration, personalisation, robot control, time series forecasting, and much more. Learning systems adapt so that they can solve new tasks, related to previously encountered tasks, more efficiently.

The course focuses on the exciting field of deep learning. By drawing inspiration from neuroscience and statistics, it introduces the basic background on neural networks, back propagation, Boltzmann machines, autoencoders, convolutional neural networks and recurrent neural networks. It illustrates how deep learning is impacting our understanding of intelligence and contributing to the practical design of intelligent machines.

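Video playlist: https://www.youtube.com/playlist?list=PLE6Wd9FR--EfW8dtjAuPoTuPcqmOV53Fu

Reference: “A machine learning course taught by Nando de Freitas at Oxford, with a focus on deep learning and guest lectures by DeepMind's Alex Graves and Karol Gregor. Both the content and the delivery are first-rate; highly recommended! Cloud drive: http://t.cn/RA2vSNX”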

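4. Udacity Deep Learning (Chinese/English) by Google

A free deep learning course on Udacity taught by Google engineers and built around TensorFlow, Google's own deep learning tool. Very good:

Machine learning is one of the fastest-growing and most exciting fields, and deep learning represents its most cutting-edge but also riskiest part. In this course, you will develop a thorough understanding of the motivation for deep learning and design intelligent systems that learn from complex and/or large-scale datasets.

We will show you how to train and optimize basic neural networks, convolutional neural networks, and long short-term memory networks. Through projects and assignments you will work with complete machine learning systems in TensorFlow. You will learn to solve new classes of problems that were once thought to be very challenging, and come to better appreciate the complex nature of artificial intelligence as you solve these problems with ease using deep learning methods.

We developed this course with Vincent Vanhoucke, Principal Scientist at Google and technical lead in the Google Brain team. The course is available in Chinese.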

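5. Udacity Nanodegree Foundation program: Deep Learning

A paid Udacity Nanodegree Foundation program that is said to be more hands-on:

Artificial intelligence is disrupting and changing our world, and deep learning is the technology driving that progress. Udacity and Silicon Valley technology stars bring you this course to help you get started systematically. Through lively Silicon Valley course content, exclusive hands-on projects, and professional code review, you will quickly master the fundamentals and cutting-edge applications of deep learning.

Every line of code in your hands-on projects receives professional review and feedback, and in cohort study groups you get guidance and encouragement from senior students and mentors throughout.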

6. Deep learning courses from fast.ai

fast.ai offers several deep learning courses with an interesting tagline: Making neural nets uncool again. Our courses (all are free and have no ads):

Deep Learning Part 1: Practical Deep Learning for Coders
Why we created the course
What we cover in the course
Deep Learning Part 2: Cutting Edge Deep Learning for Coders
Computational Linear Algebra: Online textbook and Videos
Providing a Good Education in Deep Learning—our teaching philosophy
A Unique Path to Deep Learning Expertise—our teaching approach

7. Deep learning course by Hung-yi Lee (李宏毅) of National Taiwan University: Machine Learning and having it Deep and Structured

A rare free deep learning course taught in Chinese:

Course page: http://speech.ee.ntu.edu.tw/~tlkagk/courses_MLDS17.html
Course video playlist: https://www.youtube.com/playlist?list=PLJV_el3uVTsPMxPbjeX7PicgWbY7F8wW9
Course videos mirrored on Bilibili: https://www.bilibili.com/video/av9770302/

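8. Applied deep learning course by Yun-Nung Chen (陈缊侬) of National Taiwan University: Applied Deep Learning / Machine Learning and Having It Deep and Structured

The course first ran in fall 2016 without videos. The latest run, the fall 2017 course, has just started, and lecture videos are being released on YouTube:

2016 course page, with slides and other materials: https://www.csie.ntu.edu.tw/~yvchen/f105-adl/index.html
2017 course page, with materials being released as the course proceeds: https://www.csie.ntu.edu.tw/~yvchen/f106-adl/
YouTube videos; there is no playlist yet, so follow the videos posted on the official channel: https://www.youtube.com/channel/UCyB2RBqKbxDPGCs1PokeUiA/videos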

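9. Yann LeCun's deep learning lectures

“Yann LeCun taught a course at the Collège de France in early 2016; these are the 8 lectures on deep learning. The lectures were given in French, and English subtitles were added later.
As a leading figure in AI and head of Facebook AI Research (FAIR), Yann LeCun is at the forefront of machine learning research in industry. He has said publicly that the content of some existing machine learning open courses is already somewhat dated. His lectures offer a view of the latest progress in deep learning research in recent years, and the series can serve as an advanced course for exploring deep learning.”

10. 2016 Montreal Deep Learning Summer School

Why we recommend it: just look at the speaker lineup. Yoshua Bengio teaches recurrent neural networks, Surya Ganguli teaches theoretical neuroscience and deep learning theory, Sumit Chopra teaches reasoning summit and attention, Jeff Dean covers large-scale machine learning with TensorFlow, Ruslan Salakhutdinov covers learning deep generative models, Ryan Olson covers GPU programming for deep learning, and more.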

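11. Stanford University applied deep learning course: CS231n: Convolutional Neural Networks for Visual Recognition

This deep learning course for computer vision is led by Professor Fei-Fei Li. It is designed for Stanford students, is the real deal, and is very highly rated: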

Computer Vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification, localization and detection. Recent developments in neural network (aka “deep learning”) approaches have greatly advanced the performance of these state-of-the-art visual recognition systems. This course is a deep dive into details of the deep learning architectures with a focus on learning end-to-end models for these tasks, particularly image classification. During the 10-week course, students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. The final assignment will involve training a multi-million parameter convolutional neural network and applying it on the largest image classification dataset (ImageNet). We will focus on teaching how to set up the problem of image recognition, the learning algorithms (e.g. backpropagation), practical engineering tricks for training and fine-tuning the networks and guide the students through hands-on assignments and a final course project. Much of the background and materials of this course will be drawn from the ImageNet Challenge.

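12. Stanford University applied deep learning course: Natural Language Processing with Deep Learning

This course is led by NLP heavyweights Chris Manning and Richard Socher and is the definitive way to learn deep learning for natural language processing.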

Natural language processing (NLP) is one of the most important technologies of the information age. Understanding complex language utterances is also a crucial part of artificial intelligence. Applications of NLP are everywhere because people communicate most everything in language: web search, advertisement, emails, customer service, language translation, radiology reports, etc. There are a large variety of underlying tasks and machine learning models behind NLP applications. Recently, deep learning approaches have obtained very high performance across many different NLP tasks. These models can often be trained with a single end-to-end model and do not require traditional, task-specific feature engineering. In this winter quarter course students will learn to implement, train, debug, visualize and invent their own neural network models. The course provides a thorough introduction to cutting-edge research in deep learning applied to NLP. On the model side we will cover word vector representations, window-based neural networks, recurrent neural networks, long-short-term-memory models, recursive neural networks, convolutional neural networks as well as some recent models involving a memory component. Through lectures and programming assignments students will learn the necessary engineering tricks for making neural networks work on practical problems.

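This course merges two courses its instructors previously taught at Stanford: the natural language processing course cs224n (Natural Language Processing) and the deep learning for NLP course cs224d (Deep Learning for Natural Language Processing).

13. Stanford University deep learning course: CS 20SI: Tensorflow for Deep Learning Research

Strictly speaking, this course is mainly about the deep learning tool Tensorflow: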

Tensorflow is a powerful open-source software library for machine learning developed by researchers at Google Brain. It has many pre-built functions to ease the task of building different neural networks. Tensorflow allows distribution of computation across different computers, as well as multiple CPUs and GPUs within a single machine. TensorFlow provides a Python API, as well as a less documented C++ API. For this course, we will be using Python.

This course will cover the fundamentals and contemporary usage of the Tensorflow library for deep learning research. We aim to help students understand the graphical computational model of Tensorflow, explore the functions it has to offer, and learn how to build and structure models best suited for a deep learning project. Through the course, students will use Tensorflow to build models of different complexity, from simple linear/logistic regression to convolutional neural network and recurrent neural networks with LSTM to solve tasks such as word embeddings, translation, optical character recognition. Students will also learn best practices to structure a model and manage research experiments.

14. Applied deep learning for NLP course run jointly by the University of Oxford and DeepMind: Deep Learning for Natural Language Processing: 2016-2017

Course homepage: https://www.cs.ox.ac.uk/teaching/courses/2016-2017/dl/

GitHub course project page: https://github.com/oxford-cs-deepnlp-2017/

Course video playlist: https://www.youtube.com/playlist?list=PL613dYIGMXoZBtZhbyiBqb0QtgK6oJbpm

Mirrored videos on Bilibili: https://www.bilibili.com/video/av9817911/

15. Carnegie Mellon University (CMU) applied deep learning course: CMU CS 11-747, Fall 2017 Neural Networks for NLP

Course homepage: http://phontron.com/class/nn4nlp2017/

Course video playlist: https://www.youtube.com/watch?v=Sss2EA4hhBQ&list=PL8PYTP1V4I8ABXzdqtOpB_eqBlVAz_xPT

16. A one-week deep learning course organized by MIT: 6.S191: Introduction to Deep Learning http://introtodeeplearning.com/

17. A short deep learning course (taught in English) offered in 2014 by the Nara Institute of Science and Technology (NAIST): Deep Learning and Neural Networks

18. Deep Learning course: lecture slides and lab notebooks

Recommendations of other deep learning courses not covered here are welcome.

Note: this is an original article; when reposting, please credit the source, the "课程图谱博客" (CourseGraph Blog): http://blog.coursegraph.com

Permalink: http://blog.coursegraph.com/深度学习课程资源整理

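Machine Learning Open Course Resources Update

Many readers have commented under "Coursera Course Download and Archive Plan: Update and Index" that the open course resource links had expired. I tried updating them a few times, but the new links soon expired too, and being busy with work I gradually let this slide. This time I plan to update the course resources in batches. If the updated links again expire quickly, interested readers can follow our official account, NLPJob, and reply "Coursera" to get the resources; we will collect and update the resource links later.

This round updates the links for the machine learning courses; the earlier posts will be updated in sync:

1. Stanford University, Andrew Ng: Machine Learning

This course has been released on Coursera's new course platform (https://www.coursera.org/learn/machine-learning), and the online course resources will be retained. Studying online is recommended first, since you can do the exercises and submit assignments. The Baidu Netdisk resource shared here contains two versions, both from earlier shares by readers:

Link: https://pan.baidu.com/s/1bBVtIQ Password: 26hc

2. University of Washington, Pedro Domingos: Machine Learning

This course has never officially run, but it can be previewed. It has plenty of video and rich content; downloaded with Coursera Downloader it comes to about 5 GB, the largest of all the courses downloaded so far.

Link: https://pan.baidu.com/s/1o8meCps Password: tekb

3. National Taiwan University, Hsuan-Tien Lin (林軒田): Machine Learning Foundations (机器学习基石)

This course is rated very highly on CourseGraph (课程图谱): all 10 reviews are five stars, and the comments are excellent. Reportedly Professor Lin has since left to start a company, so this course may go out of print; save it while you can. There are three versions, from earlier shares by readers or public resources online.

Link: http://pan.baidu.com/s/1hsmAsNq Password: kxfj

4. National Taiwan University, Hsuan-Tien Lin (林軒田): Machine Learning Techniques (机器学习技法)

The companion, or second half, to Machine Learning Foundations. It is just as demanding, which makes it the real thing and worth keeping.

Link: http://pan.baidu.com/s/1bpHSAPD Password: abye

5. University of Toronto, Geoffrey Hinton: Neural Networks for Machine Learning

Geoffrey Hinton's course on Coursera ran only once, in 2012, and will probably not be migrated this time:

"A must-take course for deep learning"

"Taught directly by the founder and pioneer of the school; beats every second-rate course hands down"

Given the reviews above, anyone interested in deep learning should save it soon. This share contains two versions, both from earlier shares by readers:

Link: https://pan.baidu.com/s/1sl0R7PV Password: k4ui

6. Stanford University, Daphne Koller: Probabilistic Graphical Models

This one probably will not be migrated either. So many leading researchers once taught on Coursera... This share has two versions, from earlier shares by readers and resources findable online:

Link: https://pan.baidu.com/s/1hr4X2YS Password: n5j9

Please save these soon; if the links expire again, there is no telling when they will be restored.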

Permalink: http://blog.coursegraph.com/机器学习公开课资源更新

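Coursera Course Download and Archive Plan: Update and Index

I have updated the course links in the "Coursera Archive" project on GitHub. Courses available on Coursera's new course platform now point to the new platform; the rest keep their CourseGraph links, for reference.

Judging from the updated links, some courses were retained and others are simply gone. Among the machine learning courses, for example, Hsuan-Tien Lin's two National Taiwan University courses are gone, while Geoffrey Hinton's Neural Networks for Machine Learning appears to have been revived: the new platform lists a session starting in September 2016, so we will see. Among the natural language processing courses, only Michael Collins's course was lost; the other 3 are still on the new platform, so things are not as bad as feared. In addition, Stanford's two algorithm design and analysis courses have just started, and interested readers can join right away.

Finally, here is the index for the "Coursera Course Download and Archive Plan", for lookup and reference only:

  1. Coursera Downloader download tool
  2. Coursera course quick reference
  3. Open courses on machine learning, natural language processing, recommender systems, and data mining
  4. Open courses on computer science fundamentals
  5. Other course resources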

Permalink: http://blog.coursegraph.com/coursera课程下载和存档计划更新及索引

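Coursera Course Download and Archive Plan, Part 3: Open Courses on Machine Learning, Natural Language Processing, Recommender Systems, and Data Mining

Over the weekend I sorted and categorized the Coursera courses I had saved and downloaded. First up are resources for 14 courses on machine learning, natural language processing, recommender systems, and data mining. Many of these come from shares by friends in the CourseGraph group or on Weibo, with some additions of my own. The focus is on backing up and sharing courses from Coursera's old platform; for courses that have already moved to the new platform, please take them there directly for a better learning experience. Finally, note that the netdisk resources are for personal backup and study only.

For the "Coursera Course Download and Archive Plan", see:

  1. Coursera Downloader download tool
  2. Coursera course quick reference

Below are the course resources grouped by category; the same information will be synced to the "Coursera Archive" project:

Machine learning courses: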

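1. Stanford University, Andrew Ng: Machine Learning

This course has been released on Coursera's new course platform (https://www.coursera.org/learn/machine-learning), and the online course resources will be retained. Studying online is recommended first, since you can do the exercises and submit assignments. The Baidu Netdisk resource shared here contains two versions, both from earlier shares by readers:

Link: https://pan.baidu.com/s/1bBVtIQ Password: 26hc

2. University of Washington, Pedro Domingos: Machine Learning

This course has never officially run, but it can be previewed. It has plenty of video and rich content; downloaded with Coursera Downloader it comes to about 5 GB, the largest of all the courses downloaded so far.

Link: https://pan.baidu.com/s/1o8meCps Password: tekb

3. National Taiwan University, Hsuan-Tien Lin (林軒田): Machine Learning Foundations (机器学习基石)

This course is rated very highly on CourseGraph (课程图谱): all 10 reviews are five stars, and the comments are excellent. Reportedly Professor Lin has since left to start a company, so this course may go out of print; save it while you can. There are three versions, from earlier shares by readers or public resources online.

Link: http://pan.baidu.com/s/1hsmAsNq Password: kxfj

4. National Taiwan University, Hsuan-Tien Lin (林軒田): Machine Learning Techniques (机器学习技法)

The companion, or second half, to Machine Learning Foundations. It is just as demanding, which makes it the real thing and worth keeping.

Link: http://pan.baidu.com/s/1bpHSAPD Password: abye

5. University of Toronto, Geoffrey Hinton: Neural Networks for Machine Learning

Geoffrey Hinton's course on Coursera ran only once, in 2012, and will probably not be migrated this time:

"A must-take course for deep learning"

"Taught directly by the founder and pioneer of the school; beats every second-rate course hands down"

Given the reviews above, anyone interested in deep learning should save it soon. This share contains two versions, both from earlier shares by readers:

Link: https://pan.baidu.com/s/1sl0R7PV Password: k4ui

6. Stanford University, Daphne Koller: Probabilistic Graphical Models

This one probably will not be migrated either. So many leading researchers once taught on Coursera... This share has two versions, from earlier shares by readers and resources findable online:

Link: https://pan.baidu.com/s/1hr4X2YS Password: n5j9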


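Natural language processing courses:

7. Columbia University, Michael Collins: Natural Language Processing

A course by a leading NLP researcher and a must to back up; shared earlier by a friend:
Link: http://pan.baidu.com/s/1hsbKYK8 Password: ines

Update: Link: https://pan.baidu.com/s/1c2JpM28 Password: 9dwx

8. Stanford University, Dan Jurafsky and Christopher Manning: Natural Language Processing

The instructors are Stanford professors Dan Jurafsky and Christopher Manning, both leading figures in NLP. Their textbooks alone are where many NLP practitioners start: the former co-authored Speech and Language Processing (Chinese edition: 《自然语言处理综论》), and the latter co-authored Foundations of Statistical Natural Language Processing (Chinese edition: 《统计自然语言处理基础》). Both are close to required reading for newcomers to NLP.

I downloaded a copy with coursera-dl and uploaded it to Baidu Netdisk as a backup; save it soon if you need it:

Link: http://pan.baidu.com/s/1jHKfXQm Password: s6hx

Update: http://pan.baidu.com/s/1nvbEOFf Password: pjzd

9. University of Michigan, Dragomir R. Radev: Introduction to Natural Language Processing

I do not know much about this course; I downloaded a copy as a backup:

Link: http://pan.baidu.com/s/1nu5MFVj Password: 3t3h

10. University of Illinois at Urbana-Champaign, ChengXiang Zhai (翟成祥): Text Mining and Analytics

This course has moved to Coursera's new course platform: https://www.coursera.org/learn/text-mining . The next session starts on July 11, 2016; interested readers are encouraged to study online directly and enjoy the benefits of the MOOC platform.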

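Recommender systems courses:

11. University of Minnesota, Joseph Konstan and Michael D Ekstrand: Introduction to Recommender Systems

This course has moved to Coursera's new course platform: https://www.coursera.org/learn/recommender-systems . The latest session started on June 13, 2016; interested readers can join directly. Below is a netdisk resource, a single compressed archive of the whole course:

Link: http://pan.baidu.com/s/1pLy7uvL Password: ui1u

Data mining courses:

12. Stanford University, Jeff Ullman, Anand Rajaraman, and Jure Leskovec: Mining Massive Datasets

One of the instructors is the eminent Jeff Ullman, a co-author of both the famous "Dragon Book" on compilers (《编译原理》) and the authoritative database text 《数据库系统实现》 (Database System Implementation); Google co-founder Sergey Brin was one of his students. The course has an official homepage, http://www.mmds.org/, which offers course and book resources, all freely available. The Chinese edition of the companion book of the same name is titled 《大数据 互联网大规模数据挖掘与分布式处理》, translated by Wang Bin (王斌) and now in its second edition. The netdisk resources come from readers' shares and include two versions plus an English e-book:

Link: http://pan.baidu.com/s/1c81pRC Password: e25n

13. University of Illinois at Urbana-Champaign, Jiawei Han: Pattern Discovery in Data Mining

The instructor, Jiawei Han, is an internationally known scholar in data mining. This course has moved to Coursera's new course platform, https://www.coursera.org/learn/data-patterns , and the next session starts at the end of August; interested readers can follow it.

14. University of Illinois at Urbana-Champaign, Jiawei Han: Cluster Analysis in Data Mining

A companion to the previous course, it has also moved to Coursera's new course platform, https://www.coursera.org/learn/cluster-analysis , and the next session starts in early October; interested readers can follow it.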

Permalink: http://blog.coursegraph.com/coursera课程下载和存档计划三

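Coursera Course Download and Archive Plan, Part 1: Coursera Downloader Download Tool

Last Wednesday I received a mass email from Coursera. In short, Coursera will shut down its old course platform completely on June 30 and move entirely to the new platform, and some old course resources (videos and materials) will no longer be kept. If you have taken such courses, or have favorites among them, Coursera suggests downloading the resources as a backup.

To be honest, since Coursera began its "commercial upgrade" over the past year or two, I have rarely taken open courses on the platform. Some edX courses appeal to me more, especially in course quality, where edX seems to put in much more care. Still, as one of the earliest MOOC platforms, Coursera produced many classic courses, and it would be a real pity if they were lost in the platform switch. I once compiled a "Summary of Downloadable Open Course Resources" here, much of it contributed and shared by readers, but that was two or three years ago and some of the netdisk links have expired. This email prompted me to start checking those netdisk resources, especially the ones for Coursera courses. For some courses I had neither downloaded anything nor found a netdisk copy, assuming that a Coursera account would let me log in and watch online at any time, so I felt no urge to download. Things are different now. For courses such as Dan Jurafsky and Christopher Manning's natural language processing course at Stanford, or Pedro Domingos's machine learning course, which never ran but can be watched in preview, downloading and backing up is a must.

Good work needs good tools. There are many download tools for Coursera, including some browser extensions, but the one recommended here is the Python tool Coursera Downloader, coursera-dl for short. I used it several years ago and was impressed; picking it up again, it is still very convenient and capable. The simplest way to install it is "pip install coursera"; see the installation instructions of the project on GitHub. Below, taking Mac OS as an example, is a brief walkthrough of a virtualenv-based installation. The same method should work on Linux systems such as Ubuntu; it has not been tested on Windows.

First get the code from GitHub, either with git clone or by downloading the zip source file: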

git clone https://github.com/coursera-dl/coursera-dl

Cloning into 'coursera-dl'...
remote: Counting objects: 3357, done.
remote: Compressing objects: 100% (14/14), done.
remote: Total 3357 (delta 6), reused 0 (delta 0), pack-reused 3343
Receiving objects: 100% (3357/3357), 1.39 MiB | 75 KiB/s, done.
Resolving deltas: 100% (1852/1852), done.

cd coursera-dl/

virtualenv my-coursera

New python executable in /Users/xxxxxx/project/mooc/test/coursera-dl/my-coursera/bin/python
Installing setuptools, pip, wheel…done.

source my-coursera/bin/activate

pip install -r requirements.txt

Collecting beautifulsoup4>=4.1.3 (from -r requirements.txt (line 1))
…..
Installing collected packages: beautifulsoup4, six, html5lib, requests, urllib3, pyasn1, keyring
Successfully installed beautifulsoup4-4.4.1 html5lib-1.0b8 keyring-9.0 pyasn1-0.1.9 requests-2.10.0 six-1.10.0 urllib3-1.16

Installation is complete. Here is the detailed usage of coursera-dl:

General: coursera-dl -u <user> -p <pass> modelthinking-004
Multiple classes: coursera-dl -u <user> -p <pass> saas historyofrock1-001 algo-2012-002
Filter by section name: coursera-dl -u <user> -p <pass> -sf "Chapter_Four" crypto-004
Filter by lecture name: coursera-dl -u <user> -p <pass> -lf "3.1_" ml-2012-002
Download only ppt files: coursera-dl -u <user> -p <pass> -f "ppt" qcomp-2012-001
Use a ~/.netrc file: coursera-dl -n -- matrix-001
Get the preview classes: coursera-dl -n -b ni-001
Specify download path: coursera-dl -n --path=C:\Coursera\Classes\ comnetworks-002
Display help: coursera-dl --help

Maintain a list of classes in a dir:
Initialize: mkdir -p CURRENT/{class1,class2,..classN}
Update: coursera-dl -n --path CURRENT `\ls CURRENT`
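
The "Update" step above relies on shell backticks to expand the class directories into arguments. As a rough Python sketch of the same idea (the helper name `build_update_command` is hypothetical and not part of coursera-dl), the snippet below lists the subdirectories of a folder and assembles the argument list using only the `-n` and `--path` flags shown above. It builds the command without running it.

```python
import os
import tempfile


def build_update_command(current_dir):
    """Assemble the coursera-dl argv that refreshes every class directory under current_dir.

    Hypothetical helper: it mirrors `coursera-dl -n --path CURRENT `\\ls CURRENT``
    but only returns the argument list; nothing is executed.
    """
    class_names = sorted(
        name
        for name in os.listdir(current_dir)
        if os.path.isdir(os.path.join(current_dir, name))
    )
    return ["coursera-dl", "-n", "--path", current_dir] + class_names


# Demo on a throwaway directory holding two class folders named after
# class short names that appear in the usage list above.
with tempfile.TemporaryDirectory() as root:
    for name in ("ml-2012-002", "matrix-001"):
        os.mkdir(os.path.join(root, name))
    demo_command = build_update_command(root)
    print(" ".join(demo_command))
```

Passing the result to `subprocess.call` would perform the actual update, assuming coursera-dl is installed and a ~/.netrc file is in place.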

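Take the University of Michigan's "Introduction to Natural Language Processing" course on Coursera as an example. On the old course homepage, "Introduction to Natural Language Processing", you first need to enroll in a session of the course. So far it has run only once, from October to December 2015. After joining that session and opening the course detail page, the URL looks like this: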

https://class.coursera.org/nlpintro-001/lecture

For Coursera Downloader, what matters is the class session short name "nlpintro-001"; with it you can try downloading. Here --path specifies the download directory:

coursera-dl -u <email> -p <password> --path=../../coursera_backup/ nlpintro-001
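
The class short name passed to the command is simply the first path segment of the old-platform URL. A small sketch of pulling it out in Python follows; the function name `class_short_name` is an illustration only, and the host check assumes the rule quoted in Coursera's email at the end of this post (old-platform URLs begin with class.coursera.org).

```python
from urllib.parse import urlparse

# Host used by the old Coursera platform, per the email quoted in this post.
OLD_PLATFORM_HOST = "class.coursera.org"


def class_short_name(url):
    """Return the class short name from an old-platform URL, or None otherwise."""
    parsed = urlparse(url)
    if parsed.netloc != OLD_PLATFORM_HOST:
        return None  # new-platform or unrelated URL
    segments = [segment for segment in parsed.path.split("/") if segment]
    return segments[0] if segments else None


print(class_short_name("https://class.coursera.org/nlpintro-001/lecture"))
print(class_short_name("https://class.coursera.org/machlearning-001/lecture/preview"))
print(class_short_name("https://www.coursera.org/learn/machine-learning"))
```

The first two calls yield the short names used in this post, and the new-platform URL yields None.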

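Then the download begins... Possibly because of the network, the download sometimes gets interrupted or hangs. coursera-dl offers a "Resuming downloads" mode, similar to resumable transfers, which is very useful. The following command resumes an interrupted download:

coursera-dl -u <email> -p <password> --path=../../coursera_backup/ --resume nlpintro-001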
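
Since interruptions are common, the two commands above can be wrapped in a retry loop. The following is only a sketch: `build_command` and `download_with_resume` are hypothetical helper names, and it assumes coursera-dl returns a non-zero exit status when a download is interrupted, which has not been verified here. It does not deal with a download that hangs without exiting.

```python
import subprocess


def build_command(email, password, path, class_name, resume=False):
    """Build the coursera-dl argv using only the flags shown in this post."""
    command = ["coursera-dl", "-u", email, "-p", password, "--path=" + path]
    if resume:
        command.append("--resume")
    command.append(class_name)
    return command


def download_with_resume(email, password, path, class_name,
                         max_attempts=5, run=subprocess.call):
    """Try a plain download first, then retry with --resume until it succeeds.

    Assumption: an exit status of 0 means the download finished.
    `run` is injectable so the loop can be exercised without the real tool.
    """
    for attempt in range(max_attempts):
        command = build_command(email, password, path, class_name,
                                resume=attempt > 0)
        if run(command) == 0:
            return True
    return False
```

A call such as `download_with_resume(email, password, "../../coursera_backup/", "nlpintro-001")` would then run the first command once and the `--resume` variant on each later attempt.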

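Enrolling in a course and then downloading its materials gives the most complete result: besides the videos, you also get the slides and subtitles. If you have not enrolled, Coursera Downloader offers a way to download preview courses; it only fetches the videos, and you still need a Coursera account. Take Pedro Domingos's machine learning course, which never ran but can be previewed: clicking the "Preview lectures" button on the course homepage, Machine Learning, gives the preview link "https://class.coursera.org/machlearning-001/lecture/preview". Following the Coursera Downloader instructions, you first need to create a ~/.netrc file in your home directory, in this format:

machine coursera-dl login <email> password <password>

Importantly, you need to set the permissions on ~/.netrc: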

chmod og-rw ~/.netrc

Otherwise you will hit the following error, a pitfall I have already fallen into:

~/.netrc access too permissive: access permissions must restrict access to only the owner
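
The file creation and the permission fix can also be done in one step from Python. This is a minimal sketch, not something coursera-dl provides: `write_netrc` is a hypothetical helper, the demo writes to a temporary file instead of the real ~/.netrc, and the credentials are placeholders. It sets mode 0600, which is slightly stricter than `chmod og-rw`, and it assumes the login and password contain no whitespace.

```python
import netrc
import os
import stat
import tempfile


def write_netrc(path, login, password):
    """Write a one-entry netrc file for coursera-dl that only the owner can access."""
    entry = "machine coursera-dl login {} password {}\n".format(login, password)
    # Create the file with owner-only permissions from the start.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(entry)
    # Tighten the mode again in case the file already existed with looser bits.
    os.chmod(path, 0o600)


# Demo against a temporary file; the real target would be ~/.netrc.
with tempfile.TemporaryDirectory() as demo_dir:
    demo_path = os.path.join(demo_dir, "netrc")
    write_netrc(demo_path, "user@example.com", "secret")
    demo_mode = stat.S_IMODE(os.stat(demo_path).st_mode)
    # The standard-library parser reads the entry back as (login, account, password).
    demo_login, _, demo_password = netrc.netrc(demo_path).authenticators("coursera-dl")

print(oct(demo_mode), demo_login)
```

Reading the entry back with the standard `netrc` module is a quick way to confirm the format before running coursera-dl with `-n`.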

After that, the following command downloads the preview course videos:

coursera-dl -n -b --path=../../coursera_backup/ machlearning-001

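I hope everyone uses this tool or others to save their favorite Coursera courses soon. If convenient, upload them to a netdisk and share them: you get a backup, and everyone gets to share learning resources. Attached first are the 5 Coursera open course resources compiled so far; some are still being downloaded and uploaded and will be posted in turn.

1. Machine Learning by Andrew Ng

This course has been released on Coursera's new course platform (https://www.coursera.org/learn/machine-learning), and the online course resources will be retained. The Baidu Netdisk resource shared here contains two versions, both from earlier shares by readers:

Link: http://pan.baidu.com/s/1miMZHQo Password: aeck

2. Neural Networks for Machine Learning by Geoffrey Hinton

Geoffrey Hinton's course on Coursera ran only once, in 2012, and will probably not be migrated this time:

"A must-take course for deep learning"

"Taught directly by the founder and pioneer of the school; beats every second-rate course hands down"

Given the reviews above, anyone interested in deep learning should save it soon. This share contains two versions, both from earlier shares by readers:

Link: http://pan.baidu.com/s/1sk9cgK9 Password: ndm9

3. Probabilistic Graphical Models by Professor Daphne Koller

This one probably will not be migrated either. So many leading researchers once taught on Coursera... This share comes from a friend's earlier share:

Link: http://pan.baidu.com/s/1kVpRMKn Password: 244s

4. Natural Language Processing by Michael Collins

A course by a leading NLP researcher and a must to back up; shared earlier by a friend:

Link: http://pan.baidu.com/s/1kV72IhT Password: fxjw

5. Natural Language Processing by Stanford's Dan Jurafsky and Christopher Manning

The instructors are Stanford professors Dan Jurafsky and Christopher Manning, both leading figures in NLP. Their textbooks alone are where many NLP practitioners start: the former co-authored 《自然语言处理综论》 (Speech and Language Processing), and the latter co-authored 《统计自然语言处理基础》 (Foundations of Statistical Natural Language Processing).

I downloaded a copy with coursera-dl and uploaded it to Baidu Netdisk as a backup; save it soon if you need it:

Link: http://pan.baidu.com/s/1hrGMbkg Password: a2w5

Appendix: the Coursera email: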

Save course materials for some courses by June 30

Dear XXX,

We wanted to inform you of an update to our technology platform that will affect access to some courses you previously joined.

In 2014, Coursera began developing a new technology platform to improve your learning experience, and to allow courses to run more frequently. The majority of our courses are now offered on the new platform.
This month, we are closing the old platform. One or more courses you joined are on the old platform.
Effective June 30, 2016, courses on the old platform will no longer be available. You should use this opportunity to save any relevant course materials or assignments.

How does this affect my courses?

Any courses and course materials on our old platform will no longer be accessible after June 30. Until that date, we encourage you to save any content you need for personal use and reference.
Any courses on the new platform will not be affected by this change.

Will this affect earned Certificates?
All Statements of Accomplishment (SoA) and Verified Certificates will remain accessible in your Accomplishments page, as long as you do not unenroll from courses you have completed on the old platform. You are also welcome to download a copy for your records at any time. Statements and Certificates that you have shared to LinkedIn will also be maintained on your LinkedIn profile after June 30.

How do I know if a course is on the “old platform”?

If you aren’t sure which platform a course is on currently, navigate to the course and check the URL in the browser bar – courses on the old platform have URLs that begin with class.coursera.org (rather than the new platform, which uses the URL coursera.org/learn).

How do I save course materials?

To save course materials from the old platform for reference:
• Download any lecture slides or videos that you would like to save for reference
• Save a record of your quizzes and other assignments by taking screenshots

More questions?

If you have a technical issue with your account, please visit our Help Center.
Thank you for being a part of our learning community, and for your patience and understanding through this product transition! We are excited to continue to improve the learning experience on Coursera, and we look forward to bringing you more great courses on the new platform.

Permalink: http://blog.coursegraph.com/coursera课程下载和存档计划一

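Machine Learning Open Courses Roundup

Machine learning is popular right now, and related open courses and learning resources are scattered across the web. This roundup is based on CourseGraph's machine learning open course tag, for easy reference and comparison.

1. Stanford professor Andrew Ng's "Machine Learning" open course on Coursera:

The first choice for an introduction to machine learning, taught by Stanford professor and Coursera co-founder Andrew Ng. On CourseGraph more than 400 people follow it and it has over 20 reviews; most students consider it well suited to beginners. A few of the reviews:

@ototsuyume: A very good introductory course. Many people complain that the assignments hand out too much of the code, but for an introductory course I think the programming assignments are very well designed: the effect of each machine learning method is shown very directly, which does a lot to spark interest. Imagine being given no framework code and writing everything from scratch, only to end up with a pile of bland numbers to submit; for a beginner, how discouraging that would be.
The course greatly simplifies the mathematical proofs; the derivations behind methods such as svm and pca are covered only briefly. The math prerequisites could hardly be lower, so even people who have forgotten most of their probability and matrix algebra years after graduating can follow. Besides beginners, the course also suits programmers who need some machine learning at work but do not plan to study it in depth.

@极度视界: This one is definitely 5 stars, a great introductory course. Ng made the programming assignments extremely simple, yet the datasets are not monotonous: spam classification, handwriting classification, and so on. Students taking the course should try discarding the framework Ng provides and writing their own implementation. I scored 100% in this course.

Finally, here are @小小人_V's study notes for the course: http://vdisk.weibo.com/s/J4rRX/1373287206

2. National Taiwan University professor Hsuan-Tien Lin's "機器學習基石 (Machine Learning Foundations)" open course on Coursera:

The course is in progress and looks very good so far. Professor Lin is young and accomplished, and is a co-author of the best-selling machine learning book Learning from Data. The course is probably somewhat harder than Andrew Ng's course above, but importantly it is taught in Chinese, which suits Chinese-speaking learners:

@尘绳聋: Judging from the syllabus, it largely follows the pace of Caltech/edX LFD. I took LFD before, so this is a review for me. There is new material too, such as the convergence proof for PLA and the upper bound on the number of iterations needed to converge. Lecture 3's introduction to learning types is also detailed; it turns out reinforcement learning can be used in ad systems, so it seems I should work through the large reinforcement learning part at the end of Ng's CS229. Also, the teaching is very good, and the care shows in the videos and slides.

@飞林沙: I have just finished the first two lectures and they are really great! It starts from the most basic PLA; simple as it is, following along by writing some code and working through the math is a nice break. Excellent.

The companion course from Professor Lin, "機器學習技法 (Machine Learning Techniques)", started on 2014.12.23 and is worth following.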

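3. Caltech's "Learning From Data" on edX

It is closely tied to the National Taiwan University machine learning course above: the content largely comes from the Caltech textbook of the same name, Learning From Data. Professor Lin did his PhD at Caltech, and the instructor of this course, Professor Abu-Mostafa, was his advisor.

4. University of Toronto professor Geoffrey Hinton's "Neural Networks for Machine Learning" open course on Coursera

This course focuses on neural networks and their applications in machine learning. With Deep Learning as hot as it is, the course is practically required. Unfortunately it ran only once, in October 2012, and there is no sign of another session so far. Fortunately there are netdisk backups; look for the details in the "Summary of Downloadable Open Course Resources":

@yongsun: What more is there to say? A must-take Deep Learning course!

@godenlove007: Taught directly by the founder and pioneer of the school; beats every second-rate course hands down!

@wzyer: With someone of this stature teaching, I have nothing to add.

5. University of Washington professor Pedro Domingos's "Machine Learning" open course on Coursera

A machine learning course on Coursera that has never officially started. The instructor is the machine learning authority Pedro Domingos, whose paper "A Few Useful Things to Know about Machine Learning" is widely read. Although the course has not officially started, all of its videos can be watched through the preview link. @wzyer's review: Personally I think this course goes deeper than Andrew's, and the teaching is good too. It just never seems to have officially run; I have been enrolled for over half a year...

6. "Stanford University Open Course: Machine Learning", hosted on NetEase Open Courses (网易公开课)

This is an older-generation open course. The instructor is again Professor Andrew Ng, but the videos are classroom recordings from Stanford and the course is harder, so it can serve as a follow-up to Ng's Coursera "Machine Learning" course. A plus is the translated subtitles, which help learners in China.

7. "Caltech Open Course: Machine Learning and Data Mining", hosted on NetEase Open Courses

This is in fact the original version of edX's "Learning From Data", again taught by Professor Abu-Mostafa. On edX the instructor interacts with students on the forum, while the NetEase version has translations.

8. The "Machine Learning" course from Bell Labs on Chaoxing Academic (超星学术):

A course hosted on Chaoxing Academic; I do not know the details.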

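9. Finally, machine learning course resources from China's Dragon Star Program (龙星计划):

1) Videos and slides of the 2012 Dragon Star Program machine learning course

From @SunnyerEric孙晗晓 on Weibo: videos of the Dragon Star Program machine learning course: http://t.cn/zlA2ZHb

Netdisk link: http://pan.baidu.com/share/link?shareid=27613&uk=1513052211

The Dragon Star Program slides can also be found here:

2012 Dragon Star Program machine learning slides

2) Videos of the 2013 Dragon Star Program Deep Learning course

@龙星计划
Lecture videos by Li Deng (邓力) at the Dragon Star Program Tianjin session http://t.cn/zQixW12

@戴玮_CASIA
Videos of the Tianjin University Dragon Star Program deep learning course: http://t.cn/zQixW12

Netdisk link: http://pan.baidu.com/share/link?shareid=3220401770&uk=723014463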

Permalink: http://blog.coursegraph.com/机器学习公开课汇总