CS231n Assignment 2: FC Net

Posted on 2019-04-11 | Edited on 2019-05-07 | In Image Processing , Deep Learning , CS231n Assignments

This part is from the 2018 assignment:
Stanford CS231n Assignment 2

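Goal

A two-layer FC net was implemented earlier, but its loss and gradients were computed in one piece from hand-derived math.
That works for a two-layer network, but it becomes too hard to implement once there are many layers.
So here the computation is split into a forward pass and a backward pass.

In the forward pass, a layer receives all of its inputs, weights, and other parameters, and returns the output plus a cache (which stores what the backward pass will need).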

def layer_forward(x, w):
  """ Receive inputs x and weights w """
  # Do some computations ...
  z = ...  # placeholder for some intermediate value
  # Do some more computations ...
  out = ...  # placeholder for the output

  cache = (x, w, z, out) # Values we need to compute gradients

  return out, cache

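The backward pass receives the upstream derivative and the cache stored earlier, and computes the gradients with respect to the layer's inputs and weights.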

def layer_backward(dout, cache):
  """
  Receive dout (derivative of loss with respect to outputs) and cache,
  and compute derivative with respect to inputs.
  """
  # Unpack cache values
  x, w, z, out = cache

  # Use values in cache to compute derivatives
  dx = ...  # placeholder: derivative of loss with respect to x
  dw = ...  # placeholder: derivative of loss with respect to w

  return dx, dw

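With this, the pieces can be combined into whatever architecture is needed, no matter how deep.
Some additional pieces are still needed for optimization, including Dropout and Batch/Layer Normalization.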
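
To make the forward/backward pattern concrete, here is a minimal sketch of an affine layer and a ReLU layer chained together. It is not the assignment's solution: the function names, the plain-list representation (chosen to avoid dependencies), and the toy loss (sum of the outputs) are all assumptions made for illustration.

```python
def affine_forward(x, w, b):
    """x: list of D inputs, w: D x M nested list, b: list of M biases."""
    out = [sum(x[d] * w[d][m] for d in range(len(x))) + b[m]
           for m in range(len(b))]
    cache = (x, w)  # values the backward pass needs
    return out, cache


def affine_backward(dout, cache):
    """dout: list of M upstream derivatives. Returns dx, dw, db."""
    x, w = cache
    dx = [sum(w[d][m] * dout[m] for m in range(len(dout)))
          for d in range(len(x))]
    dw = [[x[d] * dout[m] for m in range(len(dout))]
          for d in range(len(x))]
    db = list(dout)
    return dx, dw, db


def relu_forward(x):
    out = [max(0.0, v) for v in x]
    return out, x  # the input itself serves as the cache


def relu_backward(dout, cache):
    # The gradient passes through only where the input was positive.
    return [g if v > 0 else 0.0 for g, v in zip(dout, cache)]


def affine_relu_loss(x, w, b):
    """Chain affine -> ReLU with a toy loss = sum(outputs)."""
    a, affine_cache = affine_forward(x, w, b)
    h, relu_cache = relu_forward(a)
    loss = sum(h)
    dh = [1.0] * len(h)                 # d(loss)/dh for the toy loss
    da = relu_backward(dh, relu_cache)  # walk the layers in reverse order
    dx, dw, db = affine_backward(da, affine_cache)
    return loss, dx, dw, db


# Hypothetical toy values.
x = [1.0, -2.0]
w = [[0.5, -1.0], [0.25, 2.0]]
b = [0.1, 0.2]
loss, dx, dw, db = affine_relu_loss(x, w, b)
```

Each layer only needs its own cache and the upstream derivative, so adding more layers means appending more forward calls and running the matching backward calls in reverse order.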

Papers on Spatial Augmented Reality (SAR)

Posted on 2019-04-11 | Edited on 2019-04-16 | In Papers

2019 1Q AR Notes

Posted on 2019-04-09 | Edited on 2019-04-22 | In AR , Lecture Notes

Overall steps

Detect the marker
Compute its 6DOF pose
Render the 3D image
Composite the image onto the marker

slides
Learn OpenGL
(Study the instructor's C++ code style)

Notes for papers about smart kitchen

Posted on 2019-04-08 | In Papers

Choptop: An Interactive Chopping Board

2018

abstract

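An interactive chopping board
It gives the user recipe guidance, weighing, and timing
The user operates the on-screen interface by pressing on the board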

1-1
It looks like the figure above; the underside of the board is packed with sensors.

Intro

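Targets the problem that students don't cook for themselves or eat fresh food because they lack time -> it spells out how to cook step by step, with pictures, animations, and so on
A built-in timer times each step
A mobile device is used to make the recipe more interactive
Because hands get dirty while cooking, the user presses the board rather than the screen (could this use gestures instead, as in pac pac?)
Load sensors solve the weighing problem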

Approach

The main goal is a new way of learning to cook (as someone with a talent for cooking, I think this has no soul!)

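smart kitchen
- Research has aimed to improve the cooking process, promote healthier eating and make it simpler to procure ingredients; different groups approach the smart kitchen from different angles
- The system in this paper is neither especially costly nor especially large (is that a dig at me?)
load sensing
- Many force sensors have been used in earlier work
- Earlier work also used a chopping board with weight sensors and a knife with a torque sensor [9]
- The authors consider a camera of little use, and they hid all of the hardware
- The way the sensors are mounted follows [12]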

Design

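Hardware

The whole hardware unit is self-contained
The screen is driven by a microcontroller

Force sensing system

Presses are detected with edge detection -> this avoids picking up other things (I don't fully understand the principle)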

UI

The interface updates based on the information delivered from the recognition engine
A sound plays on success
Pressing different parts of the board moves in different directions

user study

Thirteen participants were recruited to prepare a salad
The System Usability Scale questionnaire was taken from [3]

future

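The trained SVM reached only 74% accuracy on the test set (quite low); improve the accuracy
Improve automatic page turning (?)
Use the user's phone as the screen
Carry out further user studies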

CookTab: Smart cutting board for creating recipe with real-time feedback

2012

abstract

Many cooks work by feel and never write down precise quantities
A chopping board that records the quantities used

intro

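Software dedicated to recording the chopping stage
It records ingredient names, quantities, video, and seasoning method, and gives real-time feedback

[3] records video and sound using a camera and mic
Systems with weight sensing added:
[2] uses four load-bearing modules and an accelerometer
But the system in this paper also gives real-time feedback

Sorry, but it really does seem to be just chopping vegetables on a pad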

Enabling Nutrition-Aware Cooking in a Smart Kitchen

CHI 2007 (probably only worth reading for the concept now)

abstract

Goal: healthy cooking (does everyone have such lofty goals?)
Sensors, detection of cooking activities, and digital feedback

intro

The main goal is to alert the user when they add too much of an ingredient, or something similar

Smart Kitchens for People with Cognitive Impairments: A Qualitative Study of Design Requirements

CHI-2018

Removing a partition on a Mac

Posted on 2019-04-08

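I recently took over someone else's MBP. After they partitioned it for dual boot, the Windows partition could not be removed.
After trying a number of things, I found that creating one more new partition and then merging them all together does the trick.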

Python dictionaries: multi-part keys and getting a key back from a value

Posted on 2019-04-05 | In Programming Languages , Python

Multiple keys

d = {(key11, key12): value}

It looks roughly like this. There is no limit on how many parts the key can have. To access the value:

d[(key11, key12)]
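
As a small illustration of both halves of the title, the sketch below uses tuple keys and then looks up keys by value. The data and the helper name `keys_for_value` are made up for the example; Python dictionaries have no built-in reverse lookup, so a linear scan is assumed here.

```python
# Hypothetical data: (row, column) pairs mapped to labels.
grid = {
    (0, 0): "origin",
    (0, 1): "right",
    (1, 0): "down",
}

# The parentheses are optional when indexing: grid[0, 1] is the same lookup.
label = grid[(0, 1)]


def keys_for_value(mapping, target):
    """Return every key whose value equals target (values need not be unique)."""
    return [key for key, value in mapping.items() if value == target]


matches = keys_for_value(grid, "down")
```

The helper returns a list because several keys may share one value; an empty list means the value is absent.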

Human Computer Interaction Notes

Posted on 2019-04-04 | Edited on 2019-05-09 | In HCI , Lecture Notes

Introduction

50% attendance and 50% report
Koike-sensei also gives assignments

History

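1945 Vannevar Bush
- The memex concept: near the end of World War II he proposed an information machine (a personal library)
  - The machine stores information internally on microfilm, captured automatically by photographing documents, so new information can keep being added. The desk has reading screens that magnify the microfilm, plus many buttons, each standing for a topic; pressing one brings up the corresponding microfilm. Each film also records the numbers of related films, so the reader can switch between them easily and read across a topic. In Dr. Bush's vision, the machine could also be networked with libraries, with some mechanism loading the library's films onto the local machine automatically. A single machine could therefore search a vast amount of information. (from Baidu Baike)
- As We May Think: the same essay also proposed a wearable computer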
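1946 ENIAC
- Often described as the first general-purpose electronic computer
- No keyboard or display; it could only be operated by hand
1951 UNIVAC
- Supported I/O
- Holes were first punched in paper tape, which was then fed in to be read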
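Ivan Sutherland
- 1963 Sketchpad: images appeared on the display, and the picture could be manipulated
- Operated with a light pen rather than a keyboard
- The ancestor of CAD
- He also built the first VR device, the Sword of Damocles (1968)
Douglas Engelbart
- The mouse appeared, fitted with rotary wheels (the upright kind) so it could move in two directions (so were there two of them?)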
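Alan Kay
- Father of the PC
- 1972 Dynabook: a cardboard mock-up, because CPUs and GPUs were too large and there was no screen; it was not a real machine
- In 1996 Toshiba made a PC with this name
1973 Xerox Alto
- A real working machine
1981 Xerox Star
- The desktop system appeared
- GUI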
Ted Nelson
- Hypertext Editing System -> use a pen to jump to another page
Steve Jobs & Bill Atkinson
- HyperCard, 1987: related information sits on one card, all the cards are linked together, and you can jump between cards. This made things very simple
Tim Berners-Lee
- father of WWW （World Wide Web）
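Richard Bolt
- Put-That-There: interaction through gesture and speech
- A very large display and a projector
Mark Weiser
- father of the concept of ubiquitous computing (1991)
- Predicted that every household would eventually have computers
Jaron Lanier
- VPL DataGlove & HMD, 1989: the glove has fibers inside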

How I/O hardware is combined -> new ways of doing HCI

design & evaluation

ACM SIGCHI

What’s HCI

CS
design
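It is an interdisciplinary field

Conferences

CHI -> more focused on ideas and on turning them into implementations
UIST -> more focused on implementation
IEEE/ACM Ubicomp
CSCW

Importance

Ensures safety and improves quality of life
Commercial productization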

GUI & hypermedia

GUI

CUI -> GUI
CUI:
- I: keyboard
- O: characters in display
GUI
- I: keyboard + mouse
- O: bitmap in display

desktop <->

like a real office environment (metaphor)
- document, folder, trash
- Very easy to understand for people who have never used a computer
- visualized as icons
- operated with a mouse
Jobs actually copied this
Not much different in concept from today's (the standard interface for computers)
pros
- visual by icon: things you can see are easier to understand
- direct manipulation interaction
- abstraction by folders
cons
- a large number of icons confuses the user
- more computing
- more physical space
- typing is faster with a keyboard

Some other ideas

room metaphor[henderson86]
- different rooms for different tasks
- multiple desks (like the multiple desktops on an MBP)
- based on user studies <- how they use each application
- It never became mainstream, sadly
super-organizing metaphor (超整理法)
- How to organize physical documents
- organize by time, not name
- sequentially, not hierarchically
- implemented by Freeman -> Lifestreams

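WIMP, a characteristic of GUIs

window, icon, menu, pointer (like a mouse)

Overall
- Apple puts the pull-down menu at the top left, because dragging from left to right is easier
- Windows: put it at the bottom because they didn't want to be the same as Apple
icons are difficult
- For example, some road signs are designed so confusingly that nobody knows what they mean

direct manipulation

For example, to delete something in a CUI you have to type the command yourself, whereas in a GUI you can drag it straight into the trash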

WYSIWYG

what you see is what you get

PUI?

I: recognition
O: large/ small displays

PUI(perceptual)

GUI -> PUI

PUI: using various input(sensors)
GUI: mouse& keyboard, not intuitive

vision-based HCI

why?

natural & intuitive?
not special device, unwired
multimodal

application

recognition
- detection & recognition
  - object <-(detect)- object detection system (find the thing in the real world)
  - object database <-(recognition)- object recognition system (know what it is)
- detect the hand
  - colors (shapes?)
  - infrared camera
    - near infrared -> video cameras (capture near infrared light on the object)
    - far infrared -> capture the heat
- hand location (where the fingers are, where the hand is)
  - hand regions
  - find the centre -> morphological operation
  - fingers -> pattern matching (there are many different methods)
tracking

gesture recognition

difficult
- segmentation of hands/body -> depth camera
- recognition of 3D pose(occlusion)
  - If the user turns around, the hands are occluded by other things
- detecting begin/end of gestures: a very important problem!
high-speed gestures (systems)
- pac！pac！
  - as many hands as possible
  - advantages
    - robust against light conditions
    - real-time with 40 people with 2 hands
    - No instruction necessary
- 3D gestures(for navigation in VR)
  - 2 cameras
  - recognise hand shapes
  - pattern classification(NN)

object recognition

tag-based
- pre-registration of objects
- difficult to attach to some things (umbrella, glove…)
- unnatural appearance
based on color information(‘1991)
- 3D histogram(RGB)
- translation/rotation invariant (if the image is moved or rotated, the color information stays the same)
- but objects with similar colors cannot be told apart
PTAM(‘2007)
- recognize feature positions
- features & markers
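
The color-based idea above can be sketched as follows. The notes only say "3D histogram (RGB)"; the comparison measure (histogram intersection), the bin count, and the toy pixel data are assumptions made for illustration.

```python
def color_histogram(pixels, bins=4):
    """Normalized 3D RGB histogram of (r, g, b) pixels with channels in 0..255."""
    step = 256 // bins
    counts = {}
    total = 0
    for r, g, b in pixels:
        key = (r // step, g // step, b // step)
        counts[key] = counts.get(key, 0) + 1
        total += 1
    return {key: n / total for key, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1 means identical color distributions."""
    return sum(min(v, h2.get(k, 0.0)) for k, v in h1.items())


# Hypothetical 2x2 "images" stored as flat pixel lists.
red, blue, green = (250, 10, 10), (10, 10, 250), (10, 250, 10)
mug = [red, red, blue, blue]
mug_rotated = [blue, red, blue, red]   # same pixels in different positions
plant = [green, green, green, red]

same = histogram_intersection(color_histogram(mug), color_histogram(mug_rotated))
different = histogram_intersection(color_histogram(mug), color_histogram(plant))
```

Because the histogram discards pixel positions, `mug` and `mug_rotated` match perfectly, which is the invariance noted above. The same property is the weakness: two different objects with similar color distributions would also score close to 1.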

gaze recognition

infrared cameras & LEDs
pattern matching
- corners of eyes and mouth, forming a triangle
- -> face direction

4/18

interactive surface

Examples
- handheld: phone, tablet
- horizontal: desk
- vertical: wall

digital desk(‘93)

overhead projector + camera + desk
metaDesk(‘97) -> uses two odd blocks to manipulate the map

LCD tabletop

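LCD -> larger, thinner, lighter, higher resolution, less expensive
- before that, projectors were used (dark)
- use as window? real glass is more expensive than LCD
principles of polarization
- Filtering light in two directions (polarizing filters)
- This can be used to detect hands by filtering out the background light behind the hand
- Works very well
  - Can be used to detect hands
- Things like AR markers are really ugly
  - design invisible markers
  - The polarizing film is cut into the shape of an AR marker; people can't see it but the machine can recognize it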

background & motivation

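traditional surfaces are planar & rigid
- difficult to make 3D surface
- photoelasticity -> transparent materials show different colors under different loads -> this also affects polarized light, so it can serve as a press input: press down and the light gets through

electrical shock

Why would there be a display that gives people electric shocks! (BIRIBIRI)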

beyond 2D surface

Caytrick surface(‘18)

4/22 information visualization

Understand information faster and more precisely (shape / colors -> information)

SciVis & InfoVis

Sci
- Users are relatively expert
- Used to understand specialized phenomena
- physical data, measured data, simulation data
Info
- abstract data
- Aimed at the general public, so it feels more intuitive
how to layout the data

three issues

scalability
- When visualizing information, if the amount of data is too large, some things look very complicated (e.g. trees)
limited display size
humans can't understand

tech

layout
scalability
filter

layout

graph drawing
- Good-looking and economical (hierarchical, radial from a center, force-directed -> unrelated nodes repel each other, circular)
tree structure
- TreeMap
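
The notes only name TreeMap. As a rough sketch of the idea, the code below assumes the simple slice-and-dice layout: every leaf gets a rectangle whose area is proportional to its size, and the cutting direction alternates at each level of the tree. The tree encoding and function names are hypothetical.

```python
def weight(node):
    """Size of a leaf, or the summed size of everything under a folder."""
    _, payload = node
    if isinstance(payload, list):
        return sum(weight(child) for child in payload)
    return payload


def slice_and_dice(node, x, y, w, h, depth=0, out=None):
    """Map each leaf name to its rectangle (x, y, width, height).

    node is (name, size) for a leaf or (name, [children]) for a folder.
    """
    if out is None:
        out = {}
    name, payload = node
    if not isinstance(payload, list):
        out[name] = (x, y, w, h)
        return out
    total = weight(node)
    offset = 0.0
    for child in payload:
        share = weight(child) / total
        if depth % 2 == 0:   # even depth: split the width
            slice_and_dice(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:                # odd depth: split the height
            slice_and_dice(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out


# Hypothetical tree: one file of size 2 and a folder holding two files of size 1.
tree = ("root", [("a", 2), ("docs", [("b", 1), ("c", 1)])])
rects = slice_and_dice(tree, 0.0, 0.0, 100.0, 50.0)
```

The whole tree fills the given rectangle with no gaps, which is how a treemap copes with limited display size.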

5/9 Cognitive process

why important

What you see is not necessarily real
By changing the HCI, what is seen can be changed

Seven Stage Model

The Xcode breakpoint 1.1 problem and opening the camera

Posted on 2019-04-04 | In IDE

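I need to pick C++ back up for class. I switched to a Mac only half a year ago, and Xcode is not as comfortable as VS; several times I have run into rather strange errors.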

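Understanding SVM

Posted on 2019-04-03 | Edited on 2019-04-04 | In Image Processing , Machine Learning , SVM

This post draws on the article 支持向量机通俗导论 (A Plain-Language Introduction to Support Vector Machines)

What exactly is an SVM

Support vector machine. For example, to split points on a 2D plane into two classes, the SVM is a straight line in that plane sitting right in the middle between the two classes, equally far from both sides. In other words, the learning strategy is to maximize the margin, which leads to solving a convex quadratic programming problem (I don't fully understand this yet, but convex problems should be relatively easy to solve)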

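Classification criterion: logistic regression

Linear classifier: x denotes the data and y the class. The classifier needs to find a hyperplane in n-dimensional space, and the hyperplane is determined by w and b,
that is, W.T.dot(x) + b = 0

(Surprisingly, w, the normal vector, is what defines the hyperplane)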

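Logistic regression

Logistic regression learns a 0/1 classification model from the features
The linear combination of the features is the input, and it ranges from negative infinity to positive infinity, so the logistic function (which turns out to be the sigmoid function) is used to map it into (0, 1); the resulting value is the probability that y = 1
If the two classes of the linear classifier are relabeled -1 and 1 (numbers chosen only for convenience), the score is simply wx plus b
A point's position can then be expressed by f(x) = wx + b: if f(x) equals 0 the point lies on the hyperplane, if it is greater than 0 the point belongs to class 1, and if it is less than 0 it belongs to class -1
The problem then becomes finding the hyperplane with the largest margin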
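
A minimal sketch of the two views described above: the sigmoid turns the linear score into a probability, and the sign of f(x) gives the -1/+1 class. The weights, the sample points, and the function names are made up for illustration.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def f(w, b, x):
    """Linear score f(x) = w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict_label(w, b, x):
    """+1 if f(x) > 0, otherwise -1.

    A point with f(x) == 0 lies on the hyperplane; it is assigned -1 here
    purely by convention.
    """
    return 1 if f(w, b, x) > 0 else -1


w, b = [1.0, -1.0], 0.5           # hypothetical hyperplane x1 - x2 + 0.5 = 0
score = f(w, b, [2.0, 0.5])
prob = sigmoid(score)             # probability of y = 1 under logistic regression
label = predict_label(w, b, [2.0, 0.5])
```

The further the score is from 0, the closer the probability is to 0 or 1, which is why the distance from the hyperplane can be read as confidence.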

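Functional margin, geometric margin

Functional margin

Once the hyperplane wx+b = 0 is fixed, |wx+b| indicates how far the point x is from the hyperplane (it is proportional to the true distance, which also divides by the norm of w)
At the same time, the sign of wx+b can be compared with the sign of y (the class label): if they agree the point is classified correctly, otherwise it is not -> the sign of y(wx+b) tells whether the classification is correct (same signs give a positive product, meaning correct)
This leads to the definition of the functional margin (multiplying by the label y of the corresponding class is what yields the absolute value)

For the whole training set, the functional margin is the smallest functional margin over all of its points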
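
The formula that originally accompanied this definition is not preserved here. Assuming the standard notation, with training points $(x_i, y_i)$ and $y_i \in \{-1, +1\}$, the functional margin of one point and of the whole training set can be written as:

```latex
\hat{\gamma}_i = y_i \left( w^{T} x_i + b \right) = y_i \, f(x_i),
\qquad
\hat{\gamma} = \min_{i = 1, \dots, n} \hat{\gamma}_i
```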

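Geometric margin

But judged this way alone, when w and b are scaled proportionally the functional margin changes too (even though the hyperplane does not), so the geometric margin is also needed

Multiplying the signed distance from x to the hyperplane by y (the label of the corresponding class) gives its absolute value.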
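
The signed-distance formula referred to here is not preserved. Assuming the standard definitions, the signed distance $\gamma_i$ and the geometric margin $\tilde{\gamma}_i$ obtained by multiplying it by the label are:

```latex
\gamma_i = \frac{w^{T} x_i + b}{\lVert w \rVert},
\qquad
\tilde{\gamma}_i = y_i \, \gamma_i
  = \frac{y_i \left( w^{T} x_i + b \right)}{\lVert w \rVert}
  = \frac{\hat{\gamma}_i}{\lVert w \rVert}
```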

In other words, the geometric margin is essentially the functional margin divided by the norm of w, which turns it into a normalized length
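
A quick numeric check of this point, as a sketch with made-up data: scaling w and b by 10 leaves the hyperplane unchanged, multiplies the functional margin by 10, and leaves the geometric margin as it was. The function names are hypothetical.

```python
import math


def functional_margin(w, b, x, y):
    """y * (w.x + b) for a single labeled point."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)


def geometric_margin(w, b, x, y):
    """Functional margin divided by the Euclidean norm of w."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return functional_margin(w, b, x, y) / norm


data = [([3.0, 4.0], 1), ([0.0, 0.0], -1)]   # hypothetical (x, y) pairs
w, b = [3.0, 4.0], -5.0

# Margin of the training set = smallest margin over its points.
func = min(functional_margin(w, b, x, y) for x, y in data)
geom = min(geometric_margin(w, b, x, y) for x, y in data)

# Same hyperplane, parameters scaled by 10.
w10, b10 = [10 * wi for wi in w], 10 * b
func10 = min(functional_margin(w10, b10, x, y) for x, y in data)
geom10 = min(geometric_margin(w10, b10, x, y) for x, y in data)
```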

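Max margin classifier

For a set of data, the larger the distance between the hyperplane and the data points, the higher the confidence of the classification

The maximum-margin objective is: max γ, where γ is the smallest margin over all the data points (no other point's margin is shorter)
If the smallest functional margin is set to 1 (for computational convenience) and the geometric margin is then taken, the objective becomes maximizing 1/||w||, where w is the normal vector of the hyperplane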
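
Written out under the standard formulation (an assumption, since the original formulas are not preserved), with $\hat{\gamma}$ the functional margin and $\tilde{\gamma} = \hat{\gamma} / \lVert w \rVert$ the geometric margin of the training set, the objective and its simplified form after fixing $\hat{\gamma} = 1$ are:

```latex
\max_{w, b} \; \tilde{\gamma}
\quad \text{s.t.} \quad
y_i \left( w^{T} x_i + b \right) = \hat{\gamma}_i \ge \hat{\gamma},
\quad i = 1, \dots, n

\hat{\gamma} = 1
\;\Longrightarrow\;
\max_{w, b} \; \frac{1}{\lVert w \rVert}
\quad \text{s.t.} \quad
y_i \left( w^{T} x_i + b \right) \ge 1,
\quad i = 1, \dots, n
```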

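A deeper look at SVM

Linearly separable and non-separable

Primal problem and dual problem (duality)

The earlier objective was 1/||w||, so maximizing it is the same as minimizing 1/2*||w||^2 (maximizing a positive quantity means minimizing its reciprocal; the 1/2 and the square are added for convenience)
The objective becomes quadratic and the constraints are linear, so this is a convex quadratic problem that an existing QP (quadratic programming) package can solve -> the loss at the optimum
Because of the structure of this problem, it can be converted into a dual problem and solved that way
- Attach a Lagrange multiplier alpha to each constraint
- Fold these into the objective function
- When all constraints are satisfied, maximizing this new function over alpha gives back the original objective.
- Minimizing that result (over w and b) then gives the minimum that was originally sought
- Finally, because the problem above is not easy to solve, the max and min are swapped: first take the minimum over w and b, then the maximum over alpha subject to its constraints. These two problems are duals of each other
- d* <= p*; when certain conditions hold the two values are equal, and then solving the dual problem yields the solution of the primal problem
Reasons for converting to the dual problem:
- The dual problem is easier to solve
- Kernel functions can be introduced, which extends the method directly to nonlinear problems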
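
The conversion steps listed above can be written out as follows. This is the standard hard-margin formulation, assumed here because the original formulas are not preserved; $\alpha_i$ are the Lagrange multipliers, $p^{*}$ is the primal optimum and $d^{*}$ the dual optimum.

```latex
\text{Primal:} \quad
\min_{w, b} \; \frac{1}{2} \lVert w \rVert^{2}
\quad \text{s.t.} \quad
y_i \left( w^{T} x_i + b \right) \ge 1, \quad i = 1, \dots, n

\mathcal{L}(w, b, \alpha)
  = \frac{1}{2} \lVert w \rVert^{2}
  - \sum_{i=1}^{n} \alpha_i \left[ y_i \left( w^{T} x_i + b \right) - 1 \right],
\qquad \alpha_i \ge 0

\theta(w, b) = \max_{\alpha_i \ge 0} \mathcal{L}(w, b, \alpha)
  = \begin{cases}
      \frac{1}{2} \lVert w \rVert^{2} & \text{if all constraints hold} \\
      +\infty & \text{otherwise}
    \end{cases}

p^{*} = \min_{w, b} \; \max_{\alpha_i \ge 0} \mathcal{L}(w, b, \alpha),
\qquad
d^{*} = \max_{\alpha_i \ge 0} \; \min_{w, b} \mathcal{L}(w, b, \alpha),
\qquad
d^{*} \le p^{*}
```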

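KKT conditions

As mentioned in the previous section, the conditions under which the dual solution is equivalent to the primal solution are the KKT conditions
Meaning of the KKT conditions: they are necessary conditions for an optimal solution of a nonlinear programming problem, and for convex problems such as this one they are also sufficient
No proof is written out here, but the optimization problem above can be shown to satisfy the KKT conditions, so it can be solved by solving the dual problem.

Steps for solving the dual problem

References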

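CS231n Assignment 1: Image Features

Posted on 2019-04-03 | Edited on 2019-04-05 | In Image Processing , Deep Learning , CS231n Assignments

Goal

Previously, a linear classifier was written and applied directly to the raw pixels of the input images
This part first extracts image features from the raw data and then classifies on those features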

RUOPENG XU
