If you’re not sure what the difference is between a controlled vocabulary, a taxonomy and a thesaurus, this blog article does a good job of clarifying those concepts and more.
It does not cover tags and folksonomies, though.
An interesting article on Boing Boing looks at Google Image Labeler and how a game challenge can entice the masses into doing a dull job…
It’s a clever way to do retrospective tagging. Organizations with large archives of data are likely to do more of this in order to open access to their “long tail”.
In a non-digital world, an analogy could be the way the BBC organized national “treasure hunts” to retrieve long missing archive material, except they were one-offs.
Google has found a way to harness the power of the masses (and no, this has nothing to do with what happens to humans in the Matrix trilogy, although …)
My team works in short iterations at the end of which we should be able to release a new version of the software with added value.
To make release notes easier to generate, another team I was observing uses markup in their version control commit messages: they add [R] to a commit message to flag the addition of a new feature to the repository.
I thought it was a good idea and introduced it as a version control standard in my team.
After a couple of iterations, there weren’t many [R] in the commit messages.
We found ourselves sometimes shying away from adding an [R] to the commit message, as we’re not always sure a feature is done or not. Also we do commit often, I’d say compulsively, at various steps of our work: it’s easy to commit the last chunk without realizing it’s the last. It is also easy to mark a commit with [R] on a chunk of code for which some files are accidentally missing from the commit.
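As a rough illustration of how the [R] convention can feed release notes, here is a minimal sketch. It assumes the commit messages have already been pulled out of the version control log into a list; the `release_notes` helper and the sample messages are made up for the example and are not either team's actual tooling.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the commit messages flagged with [R]
# and turn them into release-note bullet points.
sub release_notes {
    my @messages = @_;
    my @notes;
    for my $message (@messages) {
        # Only commits explicitly marked [R] announce a finished feature.
        next unless $message =~ /\[R\]/;
        (my $note = $message) =~ s/\s*\[R\]\s*/ /g;   # drop the marker
        $note =~ s/^\s+|\s+$//g;                      # trim the edges
        push @notes, "* $note";
    }
    return @notes;
}

# Made-up sample log for the demonstration.
my @log = (
    '[R] Add CSV export to the report page',
    'Fix typo in footer',
    'Search by author [R]',
);
print "$_\n" for release_notes(@log);
```

The sketch also shows the weakness described above: a feature whose last commit was not marked simply never appears in the output.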
After further investigation, I discovered a difference in branching strategy between the two teams:
In my team we branch on an ad hoc basis when we are about to start a risky task, but keep working on trunk for work with limited scope and impact. The other team systematically creates a branch for ANY new work, which means that merging a branch to trunk (after running the test suite on that branch) defines what a completed feature is, and the [R] mark tends to be found on the commit messages for merges.
Since we do not apply a branch-per-feature policy, we need to find other ways to identify the tasks done in an iteration. What we do practise, though, is acceptance test driven development.
It starts with a user story, then the associated acceptance tests. We use FIT to write the acceptance tests as FIT tables, and write the fixture code for the automated tests. The FIT infrastructure is held in Fitnesse, which is a wiki-on-steroids with an integrated FIT runner.
Each user story is an entry in the wiki, and Fitnesse allows you to create virtual wiki links on wiki entries, which makes it easy to create indexes (dynamically or statically updated).
We are already using these facilities to regroup all completed and validated stories on a page called RunningTestedFeature which is then executed as a suite as part of our continuous integration process for regression testing.
The interesting bit is that, in a similar way, for each iteration we can also create a page linking to all the stories planned for that iteration, and execute it as a suite at the end of the iteration. The stories whose acceptance tests pass become the bullet points of the release notes if we decide to release.
And there is an added bonus: by keeping these iteration index pages over time, it will help the calculation of velocity as it becomes easy to count how many stories are completed per iteration.
Can you imagine yourself doing your grocery shopping and when you arrive at the till, you pay only by putting your finger on a finger print reader?
According to this Spanish article, there is a trial in Germany that seems successful enough to be extended to hundreds of stores.
Apparently the system is found useful by elderly people, who don’t have to memorize a PIN or fiddle with coins and notes.
The drawback is that it is one more company holding your payment details.
On the other hand, being brick and mortar, they are not going to disappear the next day.
Source: faq-mac
In response to Chris’ comment about “OO=Evil” and his perl program that he said would have been easier to change had it been written in Java:
By saying “…much easier to alter if it were written in java”, I think you meant to say “much easier if it was properly encapsulated, layered with clearly defined interfaces between components and recognizable abstractions and design patterns”.
The Object Orientation paradigm is all about that:
The existence of documented and proven design patterns for OO further helps build and refactor OO programs.
These are characteristics of OO programming in general, not specific to a language.
You can write proper Object Oriented Programs in perl (see Damian Conway’s PBP or better, the recent effort around Moose)
You can apply the OO design patterns to such perl programs (e.g: see Object::PerlDesignPatterns)
and they will have the same benefits as properly written Java programs (easy to refactor and to reuse code).
It’s just that:
Regarding the talk on Perl’s Worst Practices and the seemingly controversial “OO=Evil”:
what Mark Fowler meant is that OO is not the only paradigm in computing, and it’s not always the best way to solve a problem.
You’ve got OO programming, procedural programming, functional programming, aspect oriented programming,…
The nature of perl doesn’t force you into any of these (“There’s More Than One Way To Do It”), and each of these paradigms is better than the others for its own class of problems.
In other words, a skilled programmer should use the paradigm that best suits the problem domain at hand.
Of course, all programmers involved in writing or changing such programs need to have the same understanding of the domain and the best suited paradigm otherwise …
Mark Fowler also hinted at the multiple unsatisfactory ways of doing OO in perl 5, along with the heavy syntax, as a reason for OO=Evil.
Perl 6 has a new object system which is very good, and it is being ported to perl 5 as Moose.
Regarding my views on OO=Evil
I’m a big fan of Object Orientation as implemented in Objective-C or Smalltalk.
Java OO is less elegant than ST/Objective-C’s.
The Perl 5 OO implementation upsets me more, but at least I can choose not to do OO in perl.
Also, the prospect of Perl 6 and the influence it has had on Perl 5 (Moose) are starting to make OO in perl more interesting and desirable.
I think OO is not always the best paradigm, and I’m liking Functional Programming more and more.
Mark Jason Dominus’ Higher-Order Perl is the “FP design patterns” reference for perl programmers.
I found it difficult to get my head around (I’m still in the first chapters of HOP), and I also don’t know the best practices for unit testing FP programs.
In retrospect, I can see a bunch of my code that might have benefited from FP. Conversely, I have also applied FP inappropriately to some code. My current project, though, is better served by OO.
Going back to OO, Ruby’s OO is close to Smalltalk’s, and Perl 6 is also borrowing things from Smalltalk (ST traits will be known as roles in Perl 6), among other languages.
So, OO in Perl 5 is the devil, but often you have to sleep with it and in the end you get used to it.
If I can identify problems in code I create or change that are better solved by FP (I really need to make progress with HOP), I’ll go for it.
In Perl 6, OO (and FP too) will be dramatically improved.
If you can’t wait, go for Ruby or Perl 5+Moose.
These are just my notes, edited to fix obvious spelling errors and to add URLs for the interesting bits. They may contain errors. I’ll post proper, scoped articles later.
characters are not bytes
in an 8-bit encoding, one char maps to one byte
that means you can have at most 256 different values
enough for roman and Russian
enough for roman alphabet and Greek
not enough for roman and Russian and Greek
multi-byte encodings
more bytes => more characters
fixed width, variable width
unicode encodings are all multi-byte
UTF-8 is very popular on the Internet
UTF-16 is the internal encoding in MS Windows
* “Character Set”
character set is character <-> number
Unicode is a charset
encoding is number <-> bytes
UTF-8 is an encoding
MIME calls them both “charset”
Perl calls them both “encoding”
2 kinds of strings:
perl has one string type
the universe has several
“text string” and “binary string”
a.k.a. “character string” and “byte string”
the computer doesn’t know the diff
you should know
* Unicode perl
text strings are unicode strings, not UTF-8
ISO-8859-1 maps to 0..255, useful!
perl keeps strings at ISO-8859-1 as long as possible
if that doesn’t work, it upgrades to UTF-8 internally
if you mix the two kinds, UTF-8 wins.
Prime rule:
Do not mix byte strings with text strings
except if you explicitly convert between them
decoding: bytes -> characters (binary to text)
encoding: characters -> bytes (text to binary)
first slide:
All communication with the “outside world” is in bytes
something has to decode their binary input to text
something has to encode your text output to binary
read input
decode input
process data
encode output
write output
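To make the decode / process / encode steps concrete, here is a small sketch using only the core Encode module. The sample bytes are an assumption chosen for the demonstration (the word "café" as UTF-8); the point is simply that the same data has a different length as a byte string and as a text string.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Bytes as they would arrive from the outside world: "café" encoded as UTF-8.
my $bytes = "caf\xC3\xA9";

# Decode input: byte string -> text string.
my $text = decode('UTF-8', $bytes);

# Process data: string operations now work on characters.
my $upper = uc $text;

# Encode output: text string -> byte string.
my $out = encode('UTF-8', $upper);

printf "bytes in: %d, characters: %d, bytes out: %d\n",
    length $bytes, length $text, length $out;
```

Here the input is 5 bytes but 4 characters, which is exactly the difference the prime rule is about.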
Neat trick:
Perl lets you use code points (character numbers) that do not yet officially exist
In practice part 1:
use Encode;
my $text = decode("UTF-8", $binary_input);
my $output = encode("UTF-8", $text);
you have to encode otherwise the output will be character not binary
part 2:
let perl do the hard work!
binmode STDIN, ":encoding(ISO-8859-1)";
binmode STDOUT, ":encoding(UTF-8)"; # don’t forget the hyphen
print while <>;
Unicode semantics
perl has unicode semantics
lc, uc, lcfirst, ucfirst
case insensitivity
character classes like \w
perl also has ASCII semantics
hard to tell which semantics will be used for some operation
utf8::upgrade($your_string) to ensure Unicode semantics
in perl 5.9.5: perlunitut
in perl 5.9.5: perlunifaq
http://juerd.nl/site.plp/perluniadvice
Don’t use encoding.pm
it is broken and cannot be fixed. Using it will hurt.
http://juerd.nl/perlunitut.html
http://www.cafepress.com/perl5/
Garry Kasparov
The Oracle of Bacon at Virginia
do the same with chess instead of movies
Garry Kasparov instead of Kevin Bacon
The question: How many hops are needed to defeat Garry at least transitively
data source: Chessbase Megabase 2005 (2007)
3 507 786 chess games
proprietary data format
but can export to PGN (Portable Game Notation, clear text format)
problem 1: max export is 2 GB files -> need to split the export
Chess::PGN::Parse
from PGN to PostgreSQL
ID is created
most logic in sql
206 650 players
drawn games are discarded
short games of less than 5 moves are discarded too (defaulted games, drunk players, other silly stuff …)
discard games that aren’t tournament games
leaves 2 385 622 games
-> graph problem -> don’t know graph theory -> CPAN!
-> Shortest Path problem: Graph::*
-> interesting: Dijkstra algorithm
-> said to be inefficient for graphs with edges of equal length
-> finding: seems to be true, long wait
-> rumor: Breadth first search should be the best
-> No breadth first in CPAN
-> rolls his own
-> first use hashes
-> inefficient for graphs with edges of equal length
-> use array (improve performances by 1 order of magnitude)
-> approaching 2 pi
-> his garry kasparov number is 4
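The breadth-first search mentioned above can be sketched in a few lines. This is not the speaker's code: the `%beat` data and the `hops` function are invented for the illustration, and the sketch uses hashes (the first, slower variant in the notes) because they are easier to read than the array version.

```perl
use strict;
use warnings;

# Hypothetical sample data: $beat{A} lists the players that A has defeated.
my %beat = (
    Alice => ['Bob'],
    Bob   => ['Carol', 'Dave'],
    Carol => ['Kasparov'],
    Dave  => [],
);

# Breadth-first search: returns the number of hops from $from to $to,
# or undef when no chain of wins connects them.
sub hops {
    my ($graph, $from, $to) = @_;
    my %dist  = ($from => 0);
    my @queue = ($from);
    while (@queue) {
        my $player = shift @queue;
        return $dist{$player} if $player eq $to;
        for my $loser (@{ $graph->{$player} || [] }) {
            next if exists $dist{$loser};    # already reached by a shorter chain
            $dist{$loser} = $dist{$player} + 1;
            push @queue, $loser;
        }
    }
    return;
}

print hops(\%beat, 'Alice', 'Kasparov'), "\n";
```

Because every edge counts as one hop, the first time the search reaches a player is guaranteed to be along a shortest chain, which is why no edge weights or priority queue are needed.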
problem for making web site
18 m Storable takes seconds to freeze
-> put the graph into RAM: mod_perl
performance: 0.1s average per query (array version): not too bad
not good: as nb of instances increases, RAM usage explodes
-> didn’t find a way to share the graph across children
other problem: Player names are not unique in Chessbase
esp. when the same player name appears in games before 1900 and after 1990, this can’t be.
solution: players who have a “gap” in their playing records for more than 40 years will be treated as 2 (or more) players. (Assumption)
-> rework tables, rebuild Storable freeze
-> build caching into the front end
computing chains takes time so queries are stored in a table when they appear for the first time
(added benefit: data for statistics)
-> redirect uncached queries to backend
-> fill the cache with “Kasparov queries”, for a head start
can link everyone to everyone
Zoechling, Karlheinz
Anderssen, Adolf is the 1st world champion
Anatoly Karpov has a Kasparov number of 1 and a Bacon number of 2!
hits: couple of thousands a day
huge quantity of data from Akamai from various sources, sometimes all at once
cron based db insertion sucks
insert email
steal good ideas from perlbal, memcached, mogilefs, db shards
glue together with POE the wrong way
from Akamai up into db and mogileFS
scalable fast architecture
queue -> reader -> storage
larger lumps of data are faster to process and transport
MogileFS data store
distributed load balanced storage
uses mysql - too many inserts is bad
JSON as compromise record encoding
aggregate data in large gzipped files
index position of records in sql db
(JSON access is fast)
2-3 months of data -> 60GB
db reads scale with clusters
but db writes don’t scale with clusters
solution -> DB shards
mock modified DBIx::Class to work with sharded databases (not yet on CPAN, but it’s planned)
other implementations:
Apache/mod_perl (faster in some way but doesn’t handle loads of transactions very well)
Event::Lib (not mature)
issue of asynchronous work flow -> need locking
mogilefs:
weirdness with small records
not that fast with writes
Akamai:
services to push back data to content provider
pre sharded version of pgsql
commercial alternative: Sybase IQ
all the nodes are load-balanced with perlbal
mail: mock@obscurity.org
web: http://sketchfactory.com
use JSON::XS (doesn’t like unicode)
* Installing perl program is hard
-> PAR
perl -MCPAN -e 'install PAR::Packer'
pp -o hellow hellow.pl
exec time
perl 0.35s
par 0.60s
-> alternative -> build own perl and ship it with the app
-> problem when moving to a different machine (paths are hard coded so are different)
-> blead to the rescue
when config perl add -Duserelocatableinc
* perl exception handling
- die means die not capture exception
- eval
- if (blessed($@) && $@->isa("NoCheeseException")) {
}
try {
throw NoCheeseException "redo";
}
catch NoCheeseException with {
}
above is perl code (see Error.pm)
-> problem (same as with eval)
in try {
return "this doesn’t return from foo";
}
replace return by rreturn
and add return allowed after the catch
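For reference, the plain eval / blessed variant from the notes can be written out as a runnable sketch with core modules only. The `NoCheeseException` class body, `make_sandwich` and `order` are assumptions made up for the example; the talk only showed the class name.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical exception class, named after the one in the notes.
package NoCheeseException;
sub new     { my ($class, %args) = @_; return bless {%args}, $class }
sub message { $_[0]{message} }

package main;

# Throws an exception object when there is no cheese.
sub make_sandwich {
    my ($cheese) = @_;
    die NoCheeseException->new(message => 'out of cheese') unless $cheese;
    return "sandwich with $cheese";
}

# Runs an order and recovers from the one exception we know about.
sub order {
    my ($cheese) = @_;
    my $result = eval { make_sandwich($cheese) };
    if (my $err = $@) {
        if (blessed($err) && $err->isa('NoCheeseException')) {
            return 'recovered: ' . $err->message;
        }
        die $err;    # not ours: rethrow
    }
    return $result;
}

print order('cheddar'), "\n";
print order(''), "\n";
```

Note that the value is carried out of the eval block through `$result`: a `return` written inside the block would only leave the eval, not `order`, which is the same problem the notes describe for `try`.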
* I hate the way perl programs are just script
Template Toolkit tpage
solution 1: source filter
solution 2: build your own executable
* I want to programmatically manipulate my code
PPI
- can’t tell the diff between certain perl constructs (like subroutine prototypes)
but reliable
MAD
when config perl -Dmad=y
B::Generate
can be used to create opcodes
optimize.pm
* real prog language can do compile time checking
use typesafety;
typesafety::check()
Perl worst practices
——————–
Good Perl
* easy to read
* beautiful
* useful
Bad Perl
* difficult to read
* ugly
* useful
* fragile
I don’t like java
java was designed for stupid people…
…but you don’t need to be stupid to use java
examples of good java made by smart people: lucene, eclipse
I like perl
perl was designed for smart people …
… but you don’t need to be smart to use perl!
Slacks Law
95% of all the people you meet are stupid
Lies, damned lies
The problems
3 big problems
variables, regexp, OO
* evil variables
1. global
2. package
3. local
4. my ?
my variables are ok in a small scope
magic global variables
$txt =~ /(\w+):(\w+)/ ;
check_name($1) ;
add_aut($2) ;
* regexp
simple components
complex machine
simple mistakes
(regexp injections)
simple solutions
use eq, substr, index, unpack
complex mistakes
regexp evolution
!DIY
use Mail::RFC822::Address;
use Regexp::Common;
* Object Orientation
OO is evil
an object is just a variable that thinks it is smart
slow
ugly
too many ways to do it
multiple inheritance !
do you really need OO?
POO = Perl Object Orientation
rod logic
Rod::Logic (unfortunately not in CPAN)
Quantum mechanics + Special relativity
Dirac equation
final diagram of Feynman
positron travels back in time
positronic program
Positronic::Variables (unfortunately not in CPAN)
Deutsch’s CTC (closed time-like curves)
TAP::Parser
will become T::H 3.0
dev release next week
* TAP
Version 13 or 14 of TAP
TAP version 1, January 30 1988
July 8 1996, version 5, all non tap ignored
Bail out!
v13: understands TAP version syntax
* TAP Parsers
runtests gets this right. prove does not
Test::Harness issues
v difficult to upgrade TAP
difficult to provide alternative view
confused with incorrect test counts
difficult to track down skip and todo
multi language tests in suite difficult
why not refactor T::H?
-> 20 years of cruft
-> several people have tried and failed
-> dangerous to break the tool chain
design goals:
backward compatible
runs on perl 5.005
no non-core modules
runs everywhere T::H does
MVC
no bugs
support new TAP versions
support multiple languages test using drivers program
todo:
* improve coverage (btw, there’s a bug in Devel::Cover)
* optimize (optimized runtests catching up with prove but return so much more information)
future plans:
parallel test runs
GUI and HTML views
improved diagnostics via a yaml subset
repeatable shuffles
runtime env description
who’s using it
Yahoo! (tagging of the tests)
xmms2 (multi languages tests)
Smolder (run locally, display remotely)
problems with Test::TAP::HTMLMatrix
(internals is yaml, not xml, no good for document which test reports are)
Gabor Szabo
CPAN::Forum
test automation
QA day:
* TAP
* FIT
* Selenium
* Automation in OSS <- subject of the talk
Business value
* reduce feedback cycle
* continuous builds
* automated smoke (regression) tests
* report generation
* overview
* current status
* drill down to see where did something break
* accountability
companies VS open source
limited budget for QA - no paid QA people
market pressure releasing buggy soft = release often, release soon
open source:
test locally, report remotely
security consideration by downloading software
* perl 5 development:
Perforce
RT
rsync to get source
commit msg in mailing list
TAP
Smoke (C compiler, Working perl, Test::Smoke)
db.test-smoke.org (not updated any more)
www.test-smoke.org
centralization or decentralization of smoke testing
perl 5: easy participation
* Parrot testing
multi language testing (perl, PASM, PIR)
smoke: use TAP and Test::TAP::HTMLMatrix
(will be replaced by Smolder)
* pugs
subversion and SVK
Needs GHC (Glasgow Haskell Compiler), Perl and Test::TAP::HTMLMatrix
* CPAN
CPANPLUS + Test::Reporter
easier is : CPAN + CPAN::Reporter
* SQLite
CVS, tests written in C and TCL
very good coverage (98%)
no automated smoke testing
CVS HEAD is currently
broken
* NUT - Network UPS tool
use BuildBot for automated build
no automated test!
need the device to be tested
the system might shut down during test
* Ruby
use subversion
unit tests written in Ruby
rubinius has separate test suite
no automated smoke testing
* PGSQL
test suite: home grown perl scripts
long and frightening list on how to setup … but is easy
need registration
10k modules on CPAN
500k from lang:perl on google code search
anatomy of a vulnerability
user manipulatable
causes harm
usually found in the boundaries between systems
(perl/sql, perl/web, perl/fs, perl string/unicode)
sql injection
xss
Flash cross-domain-policy
lang:perl open\s+[A-Z0-9]+,\s*\".*\$
gives > 19k results
lang:perl (SELECT|DELETE).*FROM.*=\s*'?[\$\@]
methodology:
find harm and also find something to manipulate
you can manipulate:
content (taint mode protects against this)
structure
race conditions (difficult to find and rarely manipulatable)
predictable state
data leakage
any variable in a template is potentially a XSS
stompy - a tool to detect bad prngs
http://lcamtuf.coredump.cx/stompy.tgz
SideJacking - is your session encrypted, or just your login
http://www.erratasec.com/sidejacking.zip
Fuzzing
PeachFuzz
http://peachfuzz.sourceforge.net
Follow the data flow from user manipulatable input to causing harm
don’t forget XS
use Moose
imports:
* keywords has, extends, with, before, after, around, super, override, inner, augment
* use strict and use warnings
* Carp::confess and Scalar::Util::blessed
no Moose; 1;
pseudo typing for perl5 -> it’s actually a validator
->meta returns meta class
metaclass defines the class
metaclass is itself an instance of a metaclass
it’s for
* introspecting
* modify classes (add/remove method, add/remove attributes)
* programmatically create classes
attribute delegation
type constraints unions
type coercions
* create subtype
* add coerce attribute
* use coerce to precisely coerce (what and how) data
Benefits of Moose
* code is less tedious
* no need to worry about basic mechanics of OO like
* object initialization
* object destruction
* attribute storage, access and initialization
* less tedium means many typo errors are all but eliminated
* code is shorter
* Moose declarative style allows you say more with less
* less code == less bugs
* less low-level testing needed
* no need to verify things which are covered by Moose test suite (3k tests)
* code becomes more descriptive (code is documentation)
Drawbacks:
* has fairly heavy compile time cost
* not good for non-persistent environments
* looking to use .pmc to reduce this burden
* some Moose features are slow at times
* speed is directly proportional to the amount of features used
* Extending non-hash based classes is tricky
* e.g: IO::* (use Class::InsideOut or Object::InsideOut or use delegation)
Matt Trout is hacking the lexer to lift some subroutines from compile time to runtime (or the other way round, can’t remember what he said)
Role system is very inefficient at the moment
definition attempt:
* approx of “Quality”
* confidence
* through passing tests, but that’s not enough
* but correlation exists if there is functional test coverage
* bug = diff between expectation and implementation
* bug = diff between test, documentation and code
* you tend towards the goal, but you won’t reach it
* ages before
* literature
* CPAN
* articles, conferences,
* Read, learn, evolve
* before
* generate skeleton
* write tests ( a tad of XP)
* while
* after
* test
* measure pod coverage
* measure tests code coverage
* measure func test coverage
* generate synthetic reports
* way after (release)
“Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live” Damian Conway
SICP’s preface:
“Thus, programs must be written for people to read, and only incidentally for machines to execute.”
* Pre requisites:
* version control
* version control standards
* coding standards
* ticket tracker
* text editor or IDE
* do not reinvent the wheel - avoid repeating others errors
* use CPAN
“I code in CPAN, the rest is syntax.” - Audrey Tang
programmers triptych
pod (hubris)
tests (laziness)
code (impatience)
At the beginning
file tree structure
Use a dedicated CPAN module
Module::Starter ( or Module::Starter::PBP)
Testing for dummies
test = confront intention * implementation
using techniques (directed or constrained random test)
and a reference model (OK ~ no <> vs reference)
TDD
test suite ~ executable specification
“old tests don’t die, they just become non-regression tests!” chromatic & Michael G Schwern
tester:
“is this correct?”
“Am I finished?”
code coverage <> functional coverage
how do I measure functional coverage in perl?
HDVL
there is SystemVerilog
for perl: Test::LectroTest
TAP
skip: because external factor
todo: not yet implemented
CPANTS
define kwalitee
metrics (13)
assertions
“dead programs tell no lies” Hunt and Thomas, Pragmatic Programmer
most tests are directed
an alternative is “constrained random testing”
let the machine do the dirty job instead
(pseudo) randomly (like in hardware testing)
-> use Test::LectroTest module
-> stick a type to each function parameter
-> add constraints to parameters (i.e restrain to subsets)
refactor early, refactor often
(on feature branches)
there is technique and there is commitment
“At that time [1909] the chief engineer was almost always the chief test pilot as well. That had the fortunate result of eliminating poor engineering early in aviation.” Igor Sikorsky
Parsing = unstructured -> data structure
closed vs open system
open system
+ flexible, powerful, unlimited
- require more understanding
Parse::RecDescent is a really excellent closed system
open system : HOP::Parser
example:
web app where user input is math function
we want a graph out of it
easy solution: use eval to turn user input into compiled perl code
cangowrong:
* input is “rm -rf”
* in perl ^ means bitwise exclusive or, not exponentiation
* …
alternative: implement an evaluator for expression
* input: string
* output: compiled code or abstract syntax tree or specialized data structure or expression object or ..
structure of an expression -> grammars
expression -> "(" expression ")" | term ("+" expression | nothing)
term -> factor ("*" term | nothing)
factor -> atom ("^" NUMBER | nothing)
atom -> NUMBER (argh!, something’s missing here)
lexing
idea: preprocess the input
humans do this when they read
* first, turn the seq of char into a sequence of words
* then try to understand the struct of the sentence based on meanings of words
* this is called lexing
lexing: is mostly matter of pattern matching
perl actually has special regex features just for this purpose
tokens
sub type{}
sub value{}
recursive descent parsing
idea: each grammar rule becomes a function
parsers
easy one: nothing
others: parsers for a specific token
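The "each grammar rule becomes a function" idea can be sketched as a tiny evaluator in plain Perl. This is not HOP::Parser and not the speaker's code; all function names are made up. It also assumes one reading of the "something’s missing" remark in the notes: the parenthesised alternative is moved into the atom rule, giving atom -> NUMBER | "(" expression ")".

```perl
use strict;
use warnings;

# Lexing: turn the character sequence into a list of tokens.
sub tokenize {
    my ($input) = @_;
    die "unexpected character in input\n" if $input =~ /[^\d\s+*^()]/;
    return $input =~ /\d+|[+*^()]/g;
}

# Each grammar rule becomes a function consuming tokens from the array ref $t.

# expression -> term ("+" expression | nothing)
sub expression {
    my ($t) = @_;
    my $value = term($t);
    if (@$t && $t->[0] eq '+') { shift @$t; $value += expression($t) }
    return $value;
}

# term -> factor ("*" term | nothing)
sub term {
    my ($t) = @_;
    my $value = factor($t);
    if (@$t && $t->[0] eq '*') { shift @$t; $value *= term($t) }
    return $value;
}

# factor -> atom ("^" NUMBER | nothing)
sub factor {
    my ($t) = @_;
    my $value = atom($t);
    if (@$t && $t->[0] eq '^') {
        shift @$t;
        my $exp = shift @$t;
        die "expected a number after ^\n"
            unless defined $exp && $exp =~ /^\d+$/;
        $value **= $exp;    # real exponentiation, not Perl's bitwise ^
    }
    return $value;
}

# atom -> NUMBER | "(" expression ")"   (assumed fix, see above)
sub atom {
    my ($t) = @_;
    my $tok = shift @$t;
    die "unexpected end of input\n" unless defined $tok;
    return $tok if $tok =~ /^\d+$/;
    if ($tok eq '(') {
        my $value = expression($t);
        my $close = shift @$t;
        die "expected )\n" unless defined $close && $close eq ')';
        return $value;
    }
    die "unexpected token $tok\n";
}

sub evaluate {
    my ($input) = @_;
    my @tokens = tokenize($input);
    my $value  = expression(\@tokens);
    die "trailing input\n" if @tokens;
    return $value;
}

print evaluate('2 + 3 * (1 + 1) ^ 2'), "\n";
```

Unlike the eval shortcut, input such as “rm -rf” is rejected by the lexer before anything is executed.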
Update: Fixed broken links
scripting languages
past, present and future
ruby most direct competitor for perl
perl 6 mix between pure scripting and programming language
lua, applescript: niche player
failed: tcl (due to lack of extensibility), *sh (clumsy addition of layers of features)
early binding vs late binding
perl6: all methods are virtual by default
single dispatch, multiple dispatch
single: python, perl5
multiple: perl6, dylan
eager or lazy evaluation
haskell: very lazy evaluation
perl6: scalar is eager by default, list is lazy
eager typology, lazy typology:
types introduced in perl6 for the multiple dispatch
fixes e.g: prototyping
removed from perl6 punctuation that were not really necessary
introduce a new one for scoping
mutable, immutable classes
java classes are immutable -> fast
ruby classes are mutable -> ruby slow
perl6 will have a mix
class based OO, prototype based OO
perl6 will be class based, but meta data will allow prototype based OO
(see Moose in perl5)
perl6: given … when … for regexp?
benefits over Mechanize: test javascript as it runs
ThoughtWorks
released as open source on openQA
use javascript and iframes in the browser
core runs the tests and interrogates the DOM
RC server and core communicate via AJAX
Core, Remote Control, IDE (firefox plugin)
Core: issues with Opera
RC: java, requires JRE version 1.5.0 or higher
experimental support for SSL
language hooks for java, .Net (C#), Perl, PHP, Python, Ruby
Mozilla same origin policy
IDE: record/playback, edit and debug tests
include Selenium Core
cpan> install Alien::SeleniumRC
(can’t upgrade due to how versions are dealt with)
cpan> install Test::WWW::Selenium
$ selenium-rc
use WWW::Selenium::Util qw(server_is_running)
evolution running website
* servers improved
* architecture too
* development tools improved
* language
* templates
hard coded html in the beginning
now templates
Template
* servers
from development on 1 server to 3-tier servers
svk with subversion
trac
* tests
* make your site faster
mod_perl (code caching)
Apache::SizeLimit
(safety net)
-> set it high
-> check it
* front/back end split
(see OmniGraffle schema)
add caching
search result sets
individual items
lookups for info from database
lookup from external sources
put caching methods in one package
separate cache for each backend servers?
-> share them (using memcached)
perlbal
-> load balancer/proxy
mod_gzip
cache headers (expiry)
/includes/js/<version>/common.js -> can be cached forever
ensures user has version which matches html
use include file to update all pages
* handling images: MogileFS
* centralize
* test
* cache
* kiss (esp. perlbal)
XML sucks (verbose, looks simple but it’s not)
XML schema
WSDL, SOAP
avoid learning XML and Schema
pure perl, compliant, complete, validating, xml message reading and writing
use XML::Compile::Schema;
good:
automatic name-spaces
type structures hidden (inheritance etc)
template generator
limitation:
only name-space based schemas
mixed content only via hooks
schemas themselves not validated
you need a schema to use the module
SOAP (PayLoad - all XML -, Transport - in application)
Payload = Body + Header (Envelope)
two kinds of SOAP:
Document
* well defined body
* requires longer schemas
XML-RPC
* interface quick and dirty
* SOAP::Lite special
* discouraged in SOAP 1.2
WSDL
message structure and transports details are grouped together.
SOAP client/server implementation still under construction
use BigInt instead of sloppy int -> slight reduction in performances.
move lots of money around to avoid interests or to gain interests
Cash management
CPAN
primary development is outsourced
needs to customize the product
needs to be integrated
database
web servers
communications
high availability
monitoring
logging
archiving
deployment
initially role is automated testing
perl as a development language is not allowed in UBS
but perl to glue things together is Ok, then development could be done
Oracle
100s of GB
* Web server
IHS: IBM re-branded version of Apache
* Communication
multiple sources
multiple format
- message transfer: how amount goes from what bank to what other bank in what currency
IBM MQSeries
- mail
- SMS
- IRC
- file transfer
pack and unpack
use MQSeries; # written and maintained by Morgan Stanley people
system handles many millions of money currency
if system breaks, huge amount of money is lost
monitoring -> Nagios
logs
50 GB a day
require application restart to log rotate
so he wrote wrappers with named pipes, correct formatting including timestamps
use Log::Log4perl
Deployment:
Sun packages
package creation
mini CPAN burnt on CD
Extra development
internal part based on Catalyst
(DBIx::Class, Template Toolkit)
Automated testing
Test::*
use Test::WWW::Selenium
trexy.com
remember search trails
my trails - all trails - blaze a trail
30 million incoming links
Sys::Statistics::Linux::MemStats
The Goo
perceptrons
sensors
Tech Pub Crawl: first Tuesday of the month in London
flag-and-bell.com
FREE BEER
network effect
-> scaling?
temporary storage area where frequently accessed data can be stored for rapid access
trade memory/disk speed
One Server:
MySQL
query cached - invalidated on write
Disk - Cache::FileCache
scales really well
memory bound
mod_perl
only one per child
shared memory
not as fast as you might think
cache is separate on each
lower hit ratio
higher miss ratio
memcached
giant hash table distributed across machines
never blocks
libevent
epoll/kqueue
slab allocator
least recently used
thread per cpu (optionally)
version 1.2.x are much better
facebook: 3TB memcached
use Cache::Memcached
Pattern:
fetch from cache
if there return
else calculate, place in cache, return
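The pattern above can be sketched as follows. To keep the example runnable without a memcached server, it uses a made-up `InMemoryCache` stand-in exposing only `get` and `set`; Cache::Memcached offers methods with the same names, so the `cached_square` function (also hypothetical) should carry over with little change.

```perl
use strict;
use warnings;

# Stand-in for a memcached client so the sketch runs without a server.
# It mimics only get and set.
package InMemoryCache;
sub new { return bless { store => {} }, shift }
sub get { my ($self, $key) = @_; return $self->{store}{$key} }
sub set { my ($self, $key, $value) = @_; $self->{store}{$key} = $value; return 1 }

package main;

my $calls = 0;

# Hypothetical expensive computation we would like to avoid repeating.
sub slow_square {
    my ($n) = @_;
    $calls++;
    return $n * $n;
}

# Fetch from cache; if there, return; else calculate, place in cache, return.
sub cached_square {
    my ($cache, $n) = @_;
    my $key   = "square:$n";
    my $value = $cache->get($key);
    return $value if defined $value;
    $value = slow_square($n);
    $cache->set($key, $value);
    return $value;
}

my $cache = InMemoryCache->new;
print cached_square($cache, 12), "\n";    # computed
print cached_square($cache, 12), "\n";    # served from the cache
print "computations: $calls\n";
```

Because a miss simply falls through to the computation, the code keeps working when the cache is emptied or disabled, which fits the "cache, not a database" point.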
cache, not a database
-> can’t dump
-> no persistence
-> no redundancy
-> no access by id
-> …
time to live
smart caching
timestamps, version number in key
cache forever
low CPU
Failover?
doesn’t do it for you
replace failed server with another with same ip
or use consistent hashing
limits:
keys: max 250 chars
values: max 1MB
Testing
* disable memcached
future:
consistent hashing
binary protocol
more statistics
http://www.danga.com/memcached/
has to push the keys to all memcached servers
memcached, perlbal, mogileFS, Djabberd, Gearman
TheSchwartz
connectors
a connector handles the pairs of socket (one for each client)
Use?
* escape the corporate proxy
CONNECT method
(abuse of the CONNECT, normally for SSL?)
* avoid Intrusion Detection Systems
* early stage of ssh negotiation is not encrypted and can be detected by IDS by doing a m/ssh/
* use hooks to hide ssh signature using one Net::Proxy before the firewall and another Net::Proxy the other side of the firewall to decrypt ssh signature
* add SSL support to an application that doesn’t support it
* run two servers on the same port
we want to run sshd and https on the same port
* in ssh negotiation, server speaks first
* in http/ssl: client speaks first
* Net::Proxy uses that to make it possible
* todo:
* write a connector fully compatible with GNU httptunnel
* enhance the httptunnel protocol to support multiple connections.
* implement reverse connectors (as you cannot connect to machines behind firewalls at the moment)
* implement DNS tunnel connectors
* implement UDP connectors
* implement a connector that can be plugged to the STDIN/STDOUT of an external process, like the ProxyCommand option of OpenSSH
* finish the starttls connector
* implement SOCKS connectors
Vienna, Austria:
The YAPC::Europe Perl conference has started, and the theme is “Social Perl”.
Next year’s conference will be held in Copenhagen, Denmark.
Today’s schedule is interesting and I’ll post some notes later.
It’s good to start a Sunday by finding a wonderful picture in one of your contacts’ photo streams. Here, Ahmed Zahid is sharing yet another wonderful sea-themed shot. I like the three mooring lines that lead the eye to the boat. I like its curves; it looks like a Venetian gondola. The lighting and reflections are gorgeous as usual.