Archive for August 2007

YAPC::Europe: Thursday

These are just my notes, corrected for obvious spelling errors and with URLs added for the interesting bits. They may contain errors. I’ll post proper, scoped articles later.

Unicode (by Juerd Waalboer)

characters are not bytes

in an 8-bit encoding, one char maps to one byte
that means you can have at most 256 different values
enough for roman and Russian
enough for roman alphabet and Greek
not enough for roman and Russian and Greek

multi-byte encodings

more bytes => more characters
fixed width, variable width
unicode encodings are all multi-byte
UTF-8 is very popular on the Internet
UTF-16 is the internal encoding in MS Windows

 * “Character Set”
character set is character <-> number
Unicode is a charset
encoding is number <-> bytes
UTF-8 is an encoding
MIME calls them both “charset”
Perl calls them both “encoding”

2 kinds of strings:
perl has one string type
the universe has several
“text string” and “binary string”
a.k.a. “character string” and “byte string”
the computer doesn’t know the difference
you should know

 * Unicode perl

text strings are unicode strings, not UTF-8
ISO-8859-1 maps to 0..255, useful!
perl keeps strings as ISO-8859-1 as long as possible
if that doesn’t work, it upgrades to UTF-8 internally
if you mix the two kinds, UTF-8 wins.

Prime rule:
Do not mix byte strings with text strings
except if you explicitly convert between them

decoding: bytes -> characters (binary to text)
encoding: characters -> bytes (text to binary)
first slide:
All communication with the “outside world” is in bytes
something has to decode the binary input to text
something has to encode your text output to binary

read input
decode input
process data
encode output
write output

Neat trick:

Perl lets you use code points (character numbers)
that do not yet officially exist

In practice part 1:

use Encode;

my $text = decode("UTF-8", $binary_input);
my $output = encode("UTF-8", $text);

you have to encode, otherwise the output will be characters, not bytes
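
A minimal sketch of that round trip, to make the bytes/characters difference visible. The variable names follow the slide; the input value (the two UTF-8 bytes of “é”) is my own assumption for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "é" as it arrives from the outside world: two UTF-8 bytes.
my $binary_input = "\xC3\xA9";

# bytes -> characters
my $text = decode("UTF-8", $binary_input);

# characters -> bytes
my $output = encode("UTF-8", $text);

printf "bytes in: %d, characters: %d, bytes out: %d\n",
    length($binary_input), length($text), length($output);
# prints "bytes in: 2, characters: 1, bytes out: 2"
```

length() counts bytes on the binary string and characters on the decoded one, which is exactly why the two kinds should not be mixed.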

part 2:
let perl do the hard work!

binmode STDIN, ":encoding(ISO-8859-1)";
binmode STDOUT, ":encoding(UTF-8)"; # don’t forget the hyphen

print while <>;

Unicode semantics

perl has unicode semantics
lc, uc, lcfirst, ucfirst
case insensitivity
character classes like \w
perl also has ASCII semantics :(
hard to tell which semantics will be used for some operation
utf8::upgrade($your_string) to ensure Unicode semantics

in perl 5.9.5: perlunitut
in perl 5.9.5: perlunifaq
http://juerd.nl/site.plp/perluniadvice

Don’t use encoding.pm
it is broken and cannot be fixed. Using it will hurt.

encoding::stdio

http://juerd.nl/perlunitut.html

http://www.cafepress.com/perl5/

Making of ibeatgarry.com (by Karlheinz Zoechling)

Garry Kasparov

The Oracle of Bacon at Virginia

do the same with chess instead of movies
Garry Kasparov instead of Kevin Bacon
The question: How many hops are needed to defeat Garry, at least transitively

data source:

Chessbase Megabase 2005 (2007)
3 507 786 chess games
proprietary data format
but can export to PGN (Portable Game Notation, clear text format)

problem 1: max export is 2 GB files -> need to split the export

Chess::PGN::Parse
from PGN to PostgreSQL
ID is created

most logic in sql

206 650 players

drawn games are discarded
short games of less than 5 moves are discarded too (defaulted games, drunk players, other silly stuff …)

discard games that aren’t tournament games
leaves 2 385 622 games

-> graph problem -> don’t know graph theory -> CPAN!
-> Shortest Path problem: Graph::*
-> interesting: Dijkstra algorithm
-> said to be inefficient for graphs with ends of equal length edges
-> finding: seems to be true, long wait
-> rumor: Breadth first search should be the best
-> No breadth first in CPAN
-> rolls his own
-> first use hashes
-> inefficient for graphs with ends of equal length edges
-> use arrays (improves performance by one order of magnitude)
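
The talk did not show the code, so this is only a sketch of what a hand-rolled breadth-first search over array-based adjacency lists could look like. The toy data, the @beats layout and the name shortest_chain are all my assumptions, not the speaker's implementation.

```perl
use strict;
use warnings;

# Hypothetical toy data: $beats[$winner] lists the ids of players that $winner has beaten.
my @beats = (
    [1, 2],   # player 0 beat players 1 and 2
    [3],      # player 1 beat player 3
    [3],      # player 2 beat player 3
    [4],      # player 3 beat player 4
    [],       # player 4 beat nobody
);

# Breadth-first search; every edge counts as one hop.
# Returns the shortest chain of player ids from $from to $to, or an empty list.
sub shortest_chain {
    my ($graph, $from, $to) = @_;
    my @previous;                      # predecessor of each visited node
    my @seen  = (0) x @$graph;
    my @queue = ($from);
    $seen[$from] = 1;

    while (@queue) {
        my $node = shift @queue;
        last if $node == $to;
        for my $next (@{ $graph->[$node] }) {
            next if $seen[$next];
            $seen[$next]     = 1;
            $previous[$next] = $node;
            push @queue, $next;
        }
    }
    return () unless $seen[$to];

    # Walk the predecessor links back from the target.
    my @chain = ($to);
    while (defined(my $prev = $previous[ $chain[0] ])) {
        unshift @chain, $prev;
    }
    return @chain;
}

my @chain = shortest_chain(\@beats, 0, 4);
print join(" -> ", @chain), "\n";   # prints "0 -> 1 -> 3 -> 4"
```

Integer ids indexing plain arrays avoid hashing a player name on every step, which is presumably where the speed-up over the hash version came from.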

->
approaching 2 pi

-> his Garry Kasparov number is 4

problem for making web site
18
m Storable takes seconds to freeze

-> put the graph into RAM: mod_perl

performance: 0.1s average per query (array version): not too bad

not good: as the number of instances increases, RAM usage explodes
-> didn’t find a way to share the graph across children

other problem: Player names are not unique in Chessbase

esp. the same player names appear in games before 1900 and after 1990; this can’t be.

solution: players who have a “gap” in their playing records of more than 40 years will be treated as 2 (or more) players. (Assumption)
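
The gap rule is simple enough to sketch. In the real site this lives in the database tables; the function name split_by_gap and the sample years below are made up for illustration.

```perl
use strict;
use warnings;

# Split one name's playing years into separate players wherever
# two consecutive years are more than $max_gap years apart.
sub split_by_gap {
    my ($max_gap, @years) = @_;
    my @players;
    my $last;
    for my $year (sort { $a <=> $b } @years) {
        if (!defined $last || $year - $last > $max_gap) {
            push @players, [];          # start a new player
        }
        push @{ $players[-1] }, $year;
        $last = $year;
    }
    return @players;
}

# Hypothetical records for a single name in the database.
my @players = split_by_gap(40, 1995, 1851, 1862, 1991, 1858);
printf "%d players: %s\n", scalar @players,
    join(" | ", map { join(",", @$_) } @players);
# prints "2 players: 1851,1858,1862 | 1991,1995"
```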

-> rework tables, rebuild Storable freeze

-> build caching into the front end
computing chains takes time, so queries are stored in a table when they appear for the first time
(added benefit: data for statistics)

-> redirect uncached queries to backend

-> fill the cache with “Kasparov queries”, for a head start

can link everyone to everyone

Zoechling, Karlheinz
Anderssen, Adolf is the 1st world champion

Anatoly Karpov has a Kasparov number of 1 and a Bacon number of 2!

hits: a couple of thousand a day

Building Scalable Data Collection (by mock)

huge quantity of data from Akamai from various sources, sometimes all at once

cron based db insertion sucks
insert email

steal good ideas from

perlbal, memcached, mogilefs, db shards

glue together with POE the wrong way

from Akamai up into db and mogileFS

scalable fast architecture

queue -> reader -> storage

larger lumps of data are faster to process and transport

MogileFS data store

distributed load balanced storage
uses mysql – too many inserts is bad

JSON as compromise record encoding
aggregate data in large gzipped files
index position of records in sql db

(JSON access is fast)

2-3 months of data -> 60GB

db reads scale with clusters
but db writes don’t scale with clusters

solution -> DB shards

mock modified DBIx::Class to work with sharded databases (not yet on CPAN, but it’s planned)
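
To make the sharding idea concrete: each record key is mapped to exactly one database, so writes are spread over the shards instead of all hitting one master. This is only a generic sketch, not mock's DBIx::Class modification; the shard names and the shard_for function are hypothetical.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical shard names; in practice these would be DSNs or database handles.
my @shards = qw(shard0 shard1 shard2 shard3);

# Map a record key to one shard, so each write goes to exactly one database.
sub shard_for {
    my ($key) = @_;
    # Use the first 8 hex digits of the MD5 digest as a 32-bit number.
    my $n = hex(substr(md5_hex($key), 0, 8));
    return $shards[ $n % @shards ];
}

for my $key (qw(customer-17 customer-18 customer-19)) {
    printf "%s -> %s\n", $key, shard_for($key);
}
```

A plain modulo like this reshuffles most keys when the number of shards changes, so it only suits a fixed shard count.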

other implementations:

Apache/mod_perl (faster in some ways but doesn’t handle loads of transactions very well)
Event::Lib (not mature)

issue of asynchronous work flow -> need locking

mogilefs:
weirdness with small records
not that fast with writes

Akamai: services to push back data to content provider

pre-sharded version of pgsql

commercial alternative: Sybase IQ

all the nodes are load-balanced with perlbal

mail: mock@obscurity.org
web: http://sketchfactory.com

use JSON::XS (doesn’t like unicode)

Perl sucks and what to do about it (by Mark Fowler)

 * Installing perl programs is hard

-> PAR

perl -MCPAN -e 'install PAR::Packer'

pp -o hellow hellow.pl

exec time
perl 0.35s
par 0.60s

-> alternative -> build own perl and ship it with the app
-> problem when moving to a different machine (paths are hard coded so are different)
-> blead to the rescue

when configuring perl add -Duserelocatableinc

 * perl exception handling

 - die means die, not capture exception
 - eval
 - if (blessed($@) && $@->isa("NoCheeseException")) {

 }

try {
    throw NoCheeseException "redo";
}
catch NoCheeseException with {

};

above is perl code

(see Error.pm)

-> problem (same as with eval)

in try {
    return "this doesn’t return from foo";
}

replace return by rreturn

and add return allowed after the catch
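
The return problem is easy to reproduce with a plain eval block, which is what the try syntax is built on. A small sketch using only core Perl; foo() and the messages are made up.

```perl
use strict;
use warnings;

sub foo {
    my $result = eval {
        return "returned from the eval block";   # leaves the eval, not foo()
    };
    # Execution continues here, so foo() has to pass the value on itself.
    return "foo() carried on and got: $result";
}

print foo(), "\n";
# prints "foo() carried on and got: returned from the eval block"

# The same block form is what catches exceptions thrown with die().
my $ok = eval { die "no cheese\n"; 1 };
print "caught: $@" unless $ok;
# prints "caught: no cheese"
```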

 * I hate the way perl programs are just scripts

Template Toolkit tpage
solution 1: source filter

solution 2: build your own executable

 * I want to programmatically manipulate my code

 PPI
 - can’t tell the difference between certain perl constructs (like subroutine prototypes)
 but reliable

 MAD

when configuring perl: -Dmad=y

 B::Generate
 can be used to create opcodes

 optimize.pm

 * real programming languages can do compile time checking

 use typesafety;

 typesafety::check()

Perl worst practices
--------------------

Good Perl
 * easy to read
 * beautiful
 * useful

Bad Perl
 * difficult to read
 * ugly
 * useful
 * fragile

I don’t like java
java was designed for stupid people…
…but you don’t need to be stupid to use java

examples of good java made by smart people: lucene, eclipse

I like perl
perl was designed for smart people …

but you don’t need to be smart to use perl!

Slacks Law

95% of all the people you meet are stupid

Lies, damned lies

The problems

3 big problems

variables, regexp, OO

 * evil variables
1. global
2. package
3. local
4. my ?

my variables are ok in a small scope

magic global variables

$txt =~ /(\w+):(\w+)/ ;
check_name($1) ;
add_aut($2) ;
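
The snippet above uses $1 and $2 without checking that the match succeeded, so a failed match silently reuses whatever an earlier match left behind. One common way around it, sketched with a hypothetical parse_entry function and made-up input:

```perl
use strict;
use warnings;

# Returns (name, author) on a successful match, or an empty list.
sub parse_entry {
    my ($txt) = @_;
    # Capture into lexicals and only trust them if the match succeeded.
    if (my ($name, $author) = $txt =~ /^(\w+):(\w+)$/) {
        return ($name, $author);
    }
    return;
}

my @good = parse_entry("perl:larry");
my @bad  = parse_entry("no separator here");

print "good: @good\n";                                  # prints "good: perl larry"
print "bad:  ", (@bad ? "@bad" : "(no match)"), "\n";   # prints "bad:  (no match)"
```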

 * regexp

 simple components
 complex machine

 simple mistakes
 (regexp injections)
 simple solutions
 use eq, substr, index, unpack

 complex mistakes
 regexp evolution
 !DIY
 use Mail::RFC822::Address;
 use Regexp::Common;
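
A small illustration of the “simple mistakes, simple solutions” point: user input interpolated into a pattern is interpreted as a regexp. The strings are made up.

```perl
use strict;
use warnings;

my $haystack   = "price: 100 USD";
my $user_input = "1.0";               # hypothetical user-supplied search string

# Interpolated as a pattern, "." matches any character: regexp injection in miniature.
my $as_regexp  = $haystack =~ /$user_input/     ? 1 : 0;

# \Q...\E (quotemeta) makes the input match literally.
my $quoted     = $haystack =~ /\Q$user_input\E/ ? 1 : 0;

# Or skip the regexp engine entirely.
my $with_index = index($haystack, $user_input) >= 0 ? 1 : 0;

print "regexp: $as_regexp, quoted: $quoted, index: $with_index\n";
# prints "regexp: 1, quoted: 0, index: 0"
```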

 * Object Orientation

 OO is evil
 an object is just a variable that thinks it is smart
 slow
 ugly
 too many ways to do it
 multiple inheritance!

 do you really need OO?

 POO = Perl Object Orientation

YAPC::Europe: Wednesday

These are just my notes, corrected for obvious spelling errors and with URLs added for the interesting bits. They may contain errors. I’ll post proper, scoped articles later.

AntiSocial Perl (Damian Conway)

rod logic

Rod::Logic (unfortunately not in CPAN :-(   ;-)

Quantum mechanics + Special relativity

Dirac equation
final diagram of Feynman

positron travels back in time

positronic program

Positronic::Variables (unfortunately not in CPAN :-(   ;-)

Deutsch’s CTC (closed time-like curves)

Test::Harness 3.0 (Curtis Poe)

TAP::Parser will become T::H 3.0
dev release next week

* TAP
Version 13 or 14 of TAP
TAP version 1, January 30 1988

July 8 1996, version 5, all non-TAP output ignored
Bail out!
v13: understands TAP version syntax

* TAP Parsers
runtests gets this right. prove does not
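
For readers who have not looked at raw TAP: a toy reader for the basics mentioned above (plan line, ok / not ok, SKIP, Bail out!, everything else ignored). This is only my sketch of the protocol, nowhere near what TAP::Parser does; the sample stream is made up.

```perl
use strict;
use warnings;

# Hypothetical TAP stream, as a test script would print it.
my @lines = (
    "TAP version 13",
    "1..3",
    "ok 1 - connects",
    "not ok 2 - parses header",
    "this line is not TAP and is ignored",
    "ok 3 - closes # SKIP no network",
);

my ($planned, $passed, $failed, $skipped) = (0, 0, 0, 0);
for my $line (@lines) {
    if ($line =~ /^1\.\.(\d+)/) {
        $planned = $1;                       # the plan line
    }
    elsif ($line =~ /^Bail out!/) {
        last;                                # stop reading immediately
    }
    elsif ($line =~ /^(not )?ok\b(.*)/) {
        my ($not, $rest) = ($1, $2);
        $skipped++ if $rest =~ /#\s*SKIP\b/i;
        $not ? $failed++ : $passed++;
    }
    # anything else, including the version line, is ignored here
}

printf "planned %d, passed %d, failed %d, skipped %d\n",
    $planned, $passed, $failed, $skipped;
# prints "planned 3, passed 2, failed 1, skipped 1"
```

Comparing the planned count with passed + failed is how a harness notices incorrect test counts.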

Test::Harness issues
very difficult to upgrade TAP
difficult to provide alternative views
confused by incorrect test counts
difficult to track down skip and todo
multi language tests in a suite are difficult

why not refactor T::H?
-> 20 years of cruft
-> several people have tried and failed
-> dangerous to break the tool chain

design goals:
backward compatible
runs on perl 5.005
no non-core modules
runs everywhere T::H does
MVC
no bugs
support new TAP versions

support tests in multiple languages using driver programs

todo:
* improve coverage (btw, there’s a bug in Devel::Cover)
* optimize (the optimized runtests is catching up with prove but returns so much more information)

future plans:
parallel test runs
GUI and HTML views
improved diagnostics via a yaml subset
repeatable shuffles
runtime env description

who’s using it
Yahoo! (tagging of the tests)
xmms2 (multi language tests)
Smolder (run locally, display remotely)

problems with Test::TAP::HTMLMatrix
(internals are yaml, not xml; no good for documents, which test reports are)
Automated Testing of Open Source software (Gabor Szabo)

Gabor Szabo

CPAN::Forum

test automation

QA day:
* TAP
* FIT
* Selenium
* Automation in OSS <- subject of the talk

Business value
* reduce feedback cycle
* continuous builds
* automated smoke (regression) tests

* report generation
* overview
* current status
* drill down to see where something broke

* accountability

companies VS open source
limited budget for QA – no paid QA people
market pressure: releasing buggy software = release often, release soon

open source:
 test locally, report remotely
 security considerations when downloading software

 szabgab.com

 * perl 5 development:
 Perforce
 RT
 rsync to get source
 commit msg in mailing list
 TAP
 Smoke (C compiler, working perl, Test::Smoke)
 db.test-smoke.org (not updated any more)
 www.test-smoke.org

centralization or decentralization of smoke testing

perl 5: easy participation

 * Parrot testing

 multi language testing (perl, PASM, PIR)
 smoke: uses TAP and Test::TAP::HTMLMatrix (will be replaced by Smolder)

 * pugs
 subversion and SVK
 Needs GHC (Glasgow Haskell Compiler), Perl and Test::TAP::HTMLMatrix

 
 * CPAN
 CPANPLUS + Test::Reporter
 easier is: CPAN + CPAN::Reporter

 * SQLite
 CVS, tests written in C and TCL
 very good coverage (98%)
 no automated smoke testing
 CVS HEAD is currently broken

 * NUT – Network UPS tool
 use BuildBot for automated build
 no automated test!
 need the device to be tested
 the system might shut down during test

 * Ruby
 use subversion
 unit tests written in Ruby
 rubinius has separate test suite
 no automated smoke testing

 * PGSQL
 test suite: home grown perl scripts
 long and frightening list on how to set up … but is easy
 need registration

How to find vulnerabilities in perl code (mock)

10k modules on CPAN
500k from lang:perl on google code search

anatomy of a vulnerability
user manipulatable
causes harm
usually found in the boundaries between systems
(perl/sql, perl/web, perl/fs, perl string/unicode)

sql injection
xss
Flash cross-domain-policy

google.com/codesearch/

lang:perl open\s+[A-Z0-9]+,\s*\".*\$
gives > 19k results

App::Ack, App::Grepl

lang:perl (SELECT|DELETE).*FROM.*=\s*'?[\$\@]
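
Why the first search pattern is interesting: a two-argument open with an interpolated variable lets the data choose the open mode. A small sketch that demonstrates this inside a temporary directory; the file name and “hostile” input are made up, and it assumes a Unix-like system.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);

# Hypothetical hostile "file name" coming from a user.
my $user_input = "> $dir/owned.txt";

# Two-argument open, the kind of call the search pattern looks for:
# the leading ">" is read as a mode, so this CREATES a file instead of reading one.
if (open(FH, "$user_input")) { close(FH) }
my $two_arg_created = -e "$dir/owned.txt" ? 1 : 0;
unlink "$dir/owned.txt";

# Three-argument open: the mode is fixed and the name is only ever a name.
my $three_arg_ok      = open(my $fh, '<', $user_input) ? 1 : 0;
my $three_arg_created = -e "$dir/owned.txt" ? 1 : 0;

print "two-arg created a file: $two_arg_created\n";
print "three-arg opened: $three_arg_ok, created a file: $three_arg_created\n";
```

The second search pattern points at the SQL equivalent: variables interpolated straight into a query string, where placeholders should be used instead.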

methodology:
find harm and also find something to manipulate
you can manipulate:
content (taint mode protects against this)
structure
race conditions (difficult to find and rarely manipulatable)
predictable state
data leakage

any variable in a template is potentially an XSS
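
A minimal sketch of the usual countermeasure, escaping a variable before it goes into HTML. escape_html is a hand-written helper for illustration only; in a real application one would rather use the template system's own HTML filter.

```perl
use strict;
use warnings;

# Minimal HTML escaping for text placed into element content or quoted attributes.
my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;', "'" => '&#39;');

sub escape_html {
    my ($text) = @_;
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

my $name = '<script>alert("xss")</script>';   # hypothetical user input
print "<p>Hello, ", escape_html($name), "</p>\n";
# prints "<p>Hello, &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;</p>"
```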

stompy - a tool to detect bad PRNGs
http://lcamtuf.coredump.cx/stompy.tgz

SideJacking - is your session encrypted, or just your login?
http://www.erratasec.com/sidejacking.zip

Fuzzing

PeachFuzz
http://peachfuzz.sourceforge.net

Follow the data flow from user manipulatable input to causing harm

don’t forget XS

http://sketchfactory.com

Introduction to Moose (Stevan Little)

use Moose
imports:
 * keywords has, extends, with, before, after, around, super, override, inner, augment
 * use strict and use warnings
 * Carp::confess and Scalar::Util::blessed

 no Moose; 1;

Moose::Util::TypeConstraints

pseudo typing for perl5 -> it’s actually a validator

->meta returns the metaclass
metaclass defines the class
metaclass is itself an instance of a metaclass

it’s for
 * introspecting
 * modifying classes (add/remove methods, add/remove attributes)
 * programmatically creating classes

attribute delegation

type constraint unions

type coercions
 * create subtype
 * add coerce attribute
 * use coerce to precisely coerce (what and how) data

 Benefits of Moose
 * code is less tedious
   * no need to worry about basic mechanics of OO like
     * object initialization
     * object destruction
     * attribute storage, access and initialization
   * less tedium means many typo errors are all but eliminated
 * code is shorter
   * Moose declarative style allows you to say more with less
   * less code == less bugs
 * less low-level testing needed
   * no need to verify things which are covered by Moose test suite (3k tests)
 * code becomes more descriptive (code is documentation)

 Drawbacks:
 * has fairly heavy compile time cost
   * not good for non-persistent environments
   * looking to use .pmc to reduce this burden
 * some Moose features are slow at times
   * speed is directly proportional to the amount of features used
 * Extending non-hash based classes is tricky
   * e.g: IO::* (use Class::InsideOut or Object::InsideOut or use delegation)

Matt Trout is hacking the lexer to lift some subroutines from compile time to runtime (or the other way round, can’t remember what he said)

The Role system is very inefficient at the moment

Kwalitee (Xavier Caron)

definition attempt:
 * approx of “Quality”
 * confidence
   * through passing tests, but that’s not enough
   * but correlation exists if there is functional test coverage
 * bug = diff between expectation and implementation
 * bug = diff between test, documentation and code
 * you tend to the goal, but you won’t reach it

 * ages before
   * literature
   * CPAN
   * articles, conferences
   * Read, learn, evolve
 * before
   * generate skeleton
   * write tests (a tad of XP)
 * while
 * after
   * test
   * measure pod coverage
   * measure tests code coverage
   * measure func test coverage
   * generate synthetic reports
 * way after (release)

       
“Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live” Damian Conway

SICP’s preface:
“Thus, programs must be written for people to read, and only incidentally for machines to execute.”

 * Prerequisites:
   * version control
   * version control standards
   * coding standards
   * ticket tracker
   * text editor or IDE

 * do not reinvent the wheel – avoid repeating others’ errors
 * use CPAN
“I code in CPAN, the rest is syntax.” – Audrey Tang

programmer’s triptych

pod (hubris)
tests (laziness)
code (impatience)

At the beginning
file tree structure
Use a dedicated CPAN module
Module::Starter (or Module::Starter::PBP)

Testing for dummies
test = confront intention * implementation
using techniques (directed or constrained random test)
and a reference model (OK ~ no <> vs reference)

TDD
test suite ~ executable specification

“old tests don’t die, they just become non-regression tests!” chromatic & Michael G Schwern

tester:
“is this correct?”
“Am I finished?”

code coverage <> functional coverage

how do I measure functional coverage in perl?

HDVL: there is SystemVerilog

for perl: Test::LectroTest

TAP

skip: because of an external factor
todo: not yet implemented

CPANTS defines kwalitee metrics (13)

assertions

“dead programs tell no lies” Hunt and Thomas, The Pragmatic Programmer

Test::LectroTest

most tests are directed

an alternative is “constrained random testing”
let the machine do the dirty job instead (pseudo) randomly (like in hardware testing)
-> use the Test::LectroTest module
 -> stick a type to each function parameter
 -> add constraints to parameters (i.e. restrain to subsets)
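
To show the idea without depending on Test::LectroTest, here is a hand-rolled sketch of constrained random testing against a reference model, in core Perl only. The function under test, the reference and the constraints are all made up for illustration; Test::LectroTest expresses the same thing declaratively.

```perl
use strict;
use warnings;

# Function under test (hypothetical): integer midpoint of a range.
sub midpoint {
    my ($lo, $hi) = @_;
    return $lo + int(($hi - $lo) / 2);
}

# Reference model: simpler, trusted implementation.
sub midpoint_reference {
    my ($lo, $hi) = @_;
    return int(($lo + $hi) / 2);
}

srand(42);                      # fixed seed so a failing run can be replayed
my $failures = 0;
for my $trial (1 .. 1000) {
    # Constraints on the parameters: 0 <= lo <= hi <= 10_000
    my $lo = int(rand(10_001));
    my $hi = $lo + int(rand(10_001 - $lo));
    my ($got, $want) = (midpoint($lo, $hi), midpoint_reference($lo, $hi));
    next if $got == $want;
    $failures++;
    print "mismatch for ($lo, $hi): got $got, expected $want\n";
}
print "failures: $failures\n";   # prints "failures: 0"
```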

refactor early, refactor often
(on feature branches)

there is technique and there is commitment

“At that time [1909] the chief engineer was almost always the chief test pilot as well. That had the fortunate result of eliminating poor engineering early in aviation.” Igor Sikorsky

Higher-Order Parsing in perl (Mark Jason Dominus)

Parsing = unstructured input -> data structure

closed vs open system

open system
 + flexible, powerful, unlimited
 - requires more understanding

 Parse::RecDescent is a really excellent closed system
 open system: HOP::Parser

example:
web app where user input is a math function
we want a graph out of it
easy solution: use eval to turn user input into compiled perl code
 can go wrong:
 * input is “rm -rf”
 * in perl ^ means bitwise exclusive or, not exponentiation
 * …
alternative: implement an evaluator for expressions
 * input: string
 * output: compiled code or abstract syntax tree or specialized data structure or expression object or ..

structure of an expression -> grammars

expression -> “(” expression “)” | term (“+” expression | nothing)

term -> factor (“*” term | nothing)

factor -> atom (“^” NUMBER | nothing)

atom -> NUMBER (argh!, something’s missing here)

lexing

idea: preprocess the input
humans do this when they read
 * first, turn the sequence of chars into a sequence of words
 * then try to understand the structure of the sentence based on the meanings of words
 * this is called lexing

lexing is mostly a matter of pattern matching

perl actually has special regex features just for this purpose

tokens

sub type{}
sub value{}

recursive descent parsing

idea: each grammar rule becomes a function

parsers

easy one: nothing
others: parsers for a specific token
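
To tie lexing and recursive descent together, here is a small self-contained evaluator for the grammar in these notes. It is a plain sketch, not HOP::Parser and not the speaker's code: it evaluates directly instead of building parser objects, the function names are mine, and I assume the missing piece of the grammar is that an atom can also be a parenthesised expression. The lexer uses \G with /gc, presumably the regex feature alluded to above.

```perl
use strict;
use warnings;

# Lexer: turn the input string into a list of [type, value] tokens.
sub lex {
    my ($input) = @_;
    my @tokens;
    while (1) {
        if    ($input =~ /\G\s+/gc)              { next }
        elsif ($input =~ /\G(\d+(?:\.\d+)?)/gc)  { push @tokens, ['NUMBER', $1] }
        elsif ($input =~ /\G([+*^()])/gc)        { push @tokens, ['OP', $1] }
        elsif ($input =~ /\G\z/gc)               { last }
        else  { die sprintf("unexpected character at position %d\n", pos($input) // 0) }
    }
    return \@tokens;
}

# Parser state: the token list and a cursor into it.
my (@tok, $pos);

sub peek   { $pos < @tok ? $tok[$pos][1] : '' }          # value of next token, '' at end
sub is_num { $pos < @tok && $tok[$pos][0] eq 'NUMBER' }

# expression -> term ("+" expression | nothing)
sub expression {
    my $value = term();
    if (peek() eq '+') { $pos++; $value += expression() }
    return $value;
}

# term -> factor ("*" term | nothing)
sub term {
    my $value = factor();
    if (peek() eq '*') { $pos++; $value *= term() }
    return $value;
}

# factor -> atom ("^" NUMBER | nothing)
sub factor {
    my $value = atom();
    if (peek() eq '^') {
        $pos++;
        die "expected a number after ^\n" unless is_num();
        $value **= $tok[$pos++][1];
    }
    return $value;
}

# atom -> NUMBER | "(" expression ")"   (the parenthesised case is my assumption)
sub atom {
    if (is_num()) { return $tok[$pos++][1] }
    if (peek() eq '(') {
        $pos++;
        my $value = expression();
        die "expected )\n" unless peek() eq ')';
        $pos++;
        return $value;
    }
    die "expected a number or (\n";
}

sub evaluate {
    my ($input) = @_;
    @tok = @{ lex($input) };
    $pos = 0;
    my $value = expression();
    die "unexpected trailing input\n" if $pos < @tok;
    return $value;
}

print evaluate("2 + 3 * (4 + 1) ^ 2"), "\n";   # prints 77
```

Unlike eval on the raw string, input such as “rm -rf” simply dies in the lexer, and ^ means exponentiation because the factor rule says so.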

YAPC::Europe: Tuesday

These are just my notes, corrected for obvious spelling errors and with URLs added for the interesting bits. They may contain errors. I’ll post proper, scoped articles later.

Update: Fixed broken links

Larry Wall’s Keynote

scripting languages

past, present and future

ruby: most direct competitor for perl

perl 6: mix between pure scripting and programming language

lua, applescript: niche players

failed: tcl (due to lack of extensibility), *sh (clumsy addition of layers of features)

early binding vs late binding
perl6: all methods are virtual by default

single dispatch, multiple dispatch
single: python, perl5
multiple: perl6, dylan

eager or lazy evaluation
haskell: very lazy evaluation
perl6: scalar is eager by default, list is lazy

eager typology, lazy typology:
types introduced in perl6 for the multiple dispatch
fixes e.g: prototyping

removed from perl6 punctuation that was not really necessary
introduced a new one for scoping

mutable, immutable classes
java classes are immutable -> fast
ruby classes are mutable -> ruby slow
perl6 will have a mix

class based OO, prototype based OO
perl6 will be class based, but meta data will allow prototype based OO
(see Moose in perl5)

perl6: given … when … for regexp?

Selenium, an introduction to web testing (Barbie)

benefits over Mechanize: test javascript as it runs

ThoughtWorks
released as open source on openQA

uses javascript and iframes in the browser
core runs the tests and interrogates the DOM
RC server and core communicate via AJAX

Core, Remote Control, IDE (firefox plugin)

Core: issues with Opera

RC: java, requires JRE version 1.5.0 or higher
experimental support for SSL
language hooks for java, .Net (C#), Perl, PHP, Python, Ruby

Mozilla same origin policy

IDE: record/playback, edit and debug tests
includes Selenium Core

cpan> install Alien::SeleniumRC

(can’t upgrade due to how versions are dealt with)

cpan> install Test::WWW::Selenium

$ selenium-rc

use WWW::Selenium::Util qw(server_is_running)

Evolving architecture – make development easy and your site faster (Leo Lapworth)

evolution

running website
 * servers improved
 * architecture too
 * development tools improved
 * language

 * templates

 hard coded html in the beginning
 now templates
 Template

 * servers

 from development on 1 server to 3-tier servers

 svk with subversion
 trac

 * tests

 * make your site faster

 mod_perl (code caching)

 Apache::SizeLimit (safety net)
 -> set it high
 -> check it

 * front/back end split

 (see Omnigraffle schema)

 add caching

 search result sets

 individual items
 lookups for info from database

 lookup from external sources

 put caching methods in one package

 separate cache for each backend server?
 -> share them (using memcached)

 perlbal -> load balancer/proxy

 mod_gzip

 cache headers (expiry)

 /includes/js/<version>/common.js -> can be cached forever

 ensures user has version which matches html

 use include file to update all pages

 * handling images: MogileFS

 * centralize
 * test
 * cache
 * kiss (esp. perlbal)

XML::Compile::SOAP (Mark Overmeer)

XML sucks (verbose, looks simple but it’s not)
 XML schema
 WSDL, SOAP

 avoid learning XML and Schema

 pure perl, compliant, complete, validating, xml message reading and writing

 use XML::Compile::Schema;

 good:
 automatic name-spaces
 type structures hidden (inheritance etc)
 template generator

 limitations:
 only name-space based schemas
 mixed content only via hooks
 schemas themselves not validated
 you need a schema to use the module

SOAP (Payload – all XML –, Transport – in application)
Payload = Body + Header (Envelope)

two kinds of SOAP:
Document
 * well defined body
 * requires longer schemas

XML-RPC
 * interface quick and dirty
 * SOAP::Lite special
 * discouraged in SOAP 1.2

WSDL
message structure and transport details are grouped together.

XML::Compile::WSDL

SOAP client/server implementation still under construction

uses BigInt instead of sloppy int -> slight reduction in performance.

Gluing a bank together (UBS) (Paul Johnson)

move lots of money around to avoid paying interest or to gain interest

Cash management

CPAN

primary development is outsourced

needs to customize the product

needs to be integrated

database
web servers
communications
high availability
monitoring
logging
archiving
deployment

initial role: automated testing

perl as a development language is not allowed in UBS
but perl to glue things together is OK, then development could be done

Oracle
100s of GB

* Web server
IHS: IBM rebranded version of Apache

* Communication

multiple sources
multiple formats

 - message transfer: what amount goes from what bank to what other bank in what currency
IBM MQSeries

 - mail
 - SMS
 - IRC
 - file transfer

 pack and unpack

use Spreadsheet::ParseExcel

use MQSeries; # written and maintained by Morgan Stanley people
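
As a reminder of what pack and unpack buy you with fixed-width message formats, here is a tiny sketch. The record layout, bank codes and amount are entirely made up; the real message formats were not shown.

```perl
use strict;
use warnings;

# Hypothetical fixed-width record:
# sender (8 chars), receiver (8 chars), currency (3 chars), amount in cents (12 digits).
my $template = "A8 A8 A3 A12";

my $record = pack($template, "BANKAAAA", "BANKBBBB", "CHF", sprintf("%012d", 123_456_789));
my ($from, $to, $currency, $cents) = unpack($template, $record);

printf "%s -> %s: %.2f %s (record is %d bytes)\n",
    $from, $to, $cents / 100, $currency, length $record;
# prints "BANKAAAA -> BANKBBBB: 1234567.89 CHF (record is 31 bytes)"
```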

system handles many millions in currency

if the system breaks, a huge amount of money is lost

monitoring -> Nagios

logs

50 GB a day
requires an application restart to rotate logs

so he wrote wrappers with named pipes, correct formatting including timestamps

use Log::Log4perl

Deployment:

Sun packages

package creation

mini CPAN burnt on CD

Extra development

internal part based on Catalyst (DBIx::Class, Template Toolkit)

Automated testing

Test::*

use Test::WWW::Selenium

Trexy (Nigel Hamilton)

trexy.com

remember search trails

my trails – all trails – blaze a trail

30 million incoming links

Sys::Statistics::Linux::MemStats

pingability.com

webmin

Template::Simple

The Goo

perceptrons
sensors

http://blog.thegoo.org

Tech Pub Crawl: first Tuesday of the month in London
flag-and-bell.com
FREE BEER

memcached (Leon Brocard)

network effect -> scaling?

temporary storage area where frequently accessed data can be stored for rapid access

trade memory/disk for speed

One Server:

MySQL query cache – invalidated on write

Disk – Cache::FileCache
scales really well
memory bound

mod_perl
only one per child

shared memory
not as fast as you might think

cache is separate on each

lower hit ratio
higher miss ratio

memcached
 giant hash table distributed across machines

 never blocks
 libevent
 epoll/kqueue
 slab allocator
 least recently used
 thread per cpu (optionally)
versions 1.2.x are much better

facebook: 3TB memcached

use Cache::Memcached

Pattern:
fetch from cache
if there, return
else calculate, place in cache, return
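
The pattern in code, as a sketch. So that it runs without a memcached server, it uses a tiny in-memory stand-in (My::FakeCache, made up here) that only mimics the get and set method names of Cache::Memcached; expensive_lookup and the key format are hypothetical too.

```perl
use strict;
use warnings;

# Tiny in-memory stand-in so the sketch runs without a memcached server.
# It only mimics the get/set method names used with Cache::Memcached.
{
    package My::FakeCache;
    sub new { my ($class) = @_; return bless { store => {} }, $class }
    sub get { my ($self, $key) = @_; return $self->{store}{$key} }
    sub set { my ($self, $key, $value) = @_; $self->{store}{$key} = $value; return 1 }
}

my $cache        = My::FakeCache->new;
my $calculations = 0;

sub expensive_lookup {          # hypothetical slow operation (database query, rendering, ...)
    my ($id) = @_;
    $calculations++;
    return "profile for user $id";
}

sub cached_lookup {
    my ($id) = @_;
    my $key   = "user:$id";
    my $value = $cache->get($key);        # fetch from cache
    return $value if defined $value;      # if there, return
    $value = expensive_lookup($id);       # else calculate,
    $cache->set($key, $value);            # place in cache,
    return $value;                        # return
}

cached_lookup(42) for 1 .. 3;
print "calculations: $calculations\n";    # prints "calculations: 1"
```

With the real module, set also takes an expiry time as a third argument, which is where the time to live comes in.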

cache, not a database
-> can’t dump
-> no persistence
-> no redundancy
-> no access by id
-> …

time to live

smart caching
timestamps, version number in key
cache forever

low CPU

Failover?
doesn’t do it for you
replace failed server with another with the same ip
or use consistent hashing

limits:
keys: max 250 chars
values: max 1MB

Testing
* disable memcached

future:

consistent hashing
binary protocol
more statistics

http://www.danga.com/memcached/

has to push the keys to all memcached servers

memcached, perlbal, mogileFS, Djabberd, Gearman

TheSchwartz

Net::Proxy (Philippe Bruhat)

connectors

a connector handles the pairs of sockets (one for each client)

Use?
 * escape the corporate proxy
   CONNECT method
   (abuse of the CONNECT method, normally for SSL?)
 * avoid Intrusion Detection Systems
   * early stage of ssh negotiation is not encrypted and can be detected by an IDS by doing a m/ssh/
   * use hooks to hide the ssh signature, using one Net::Proxy before the firewall and another Net::Proxy on the other side of the firewall to decrypt the ssh signature
 * add SSL support to an application that doesn’t support it
 * run two servers on the same port
   we want to run sshd and https on the same port
   * in ssh negotiation, server speaks first
   * in http/ssl: client speaks first
   * Net::Proxy uses that to make it possible
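
The “who speaks first” trick can be sketched with a timeout on the freshly accepted socket. This is not Net::Proxy's code, only an illustration of the idea under my own assumptions: guess_backend is a made-up name, the 0.2s timeout is arbitrary, and socket pairs stand in for real network connections (Unix-like systems only).

```perl
use strict;
use warnings;
use IO::Select;
use Socket;

# Decide which backend a new connection is for by waiting briefly for the client to speak.
# HTTPS clients send data at once; ssh clients wait for the server banner.
sub guess_backend {
    my ($client, $timeout) = @_;
    my @ready = IO::Select->new($client)->can_read($timeout);
    return @ready ? 'https' : 'ssh';
}

# Simulate two clients with socket pairs instead of real network connections.
socketpair(my $ssl_client, my $ssl_server_side, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";
socketpair(my $ssh_client, my $ssh_server_side, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

syswrite($ssl_client, "\x16\x03\x01");   # first bytes of a TLS handshake record
# the ssh client stays silent, waiting for the server's banner

print "talkative client -> ", guess_backend($ssl_server_side, 0.2), "\n";
print "silent client    -> ", guess_backend($ssh_server_side, 0.2), "\n";
```

The price is that every ssh connection waits out the timeout before it is handed to sshd.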

 * todo:
   * write a connector fully compatible with GNU httptunnel
   * enhance the httptunnel protocol to support multiple connections.
   * implement reverse connectors (as you cannot connect to machines behind firewalls at the moment)
   * implement DNS tunnel connectors
   * implement UDP connectors
   * implement a connector that can be plugged to the STDIN/STDOUT of an external process, like the ProxyCommand option of OpenSSH
   * finish the starttls connector
   * implement SOCKS connectors